(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(19) World Intellectual Property Organization
International Bureau

(43) International Publication Date
26 February 2004 (26.02.2004)

(10) International Publication Number
WO 2004/016637 A1



(51) International Patent Classification: C07H 21/00



(21) International Application Number: 

PCT/KR2003/001655 

(22) International Filing Date: 14 August 2003 (14.08.2003) 



(25) Filing Language: English

(26) Publication Language: English



(30) Priority Data:
60/402,905    14 August 2002 (14.08.2002)    US
60/403,651    16 August 2002 (16.08.2002)    US



(71) Applicant (for all designated States except US): LG LIFE
SCIENCES LTD. [KR/KR]; LG Twin Tower, East Tower,
20, Yoido-Dong, Youngdungpo-Ku, Seoul 150-010 (KR).

(72) Inventors; and

(75) Inventors/Applicants (for US only): KOH, Sang Seok
[KR/KR]; R&D Park, LG Life Sciences Ltd., 104-1,
Moongi-dong, Yuseong-gu, Taejeon 305-380 (KR). LIU,
Qing [US/US]; 708 Quince Orchard Road, Gaithersburg,
Maryland, MD 20878 (US). CHUNG, Hyun-Ho
[KR/KR]; R&D Park, LG Life Sciences Ltd., 104-1,
Moongi-dong, Yuseong-gu, Taejeon 305-380 (KR).
ZENG, Wen [US/US]; 708 Quince Orchard Road,
Gaithersburg, Maryland, MD 20878 (US). LEE, Bogman
[KR/KR]; R&D Park, LG Life Sciences Ltd.,
104-1, Moongi-dong, Yuseong-gu, Taejeon 305-380 (KR).
SONG, Si Young [KR/KR]; College of Medicine, Yonsei
University, 134, Sinchon-dong, Seodaemun-gu, Seoul
120-749 (KR).



(74) Agent: CHOI, Kyu Pal; Halla Classic Building 4F.,
824-11, Yeoksam-dong, Kangnam-ku, Seoul 135-080
(KR).



(81) Designated States (national): AE, AG, AL, AM, AT, AU,
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU,
CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH,
GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC,
LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW,
MX, MZ, NI, NO, NZ, OM, PG, PH, PL, PT, RO, RU, SC,
SD, SE, SG, SK, SL, SY, TJ, TM, TN, TR, TT, TZ, UA,
UG, US, UZ, VC, VN, YU, ZA, ZM, ZW.

(84) Designated States (regional): ARIPO patent (GH, GM,
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW),
Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM),
European patent (AT, BE, BG, CH, CY, CZ, DE, DK, EE,
ES, FI, FR, GB, GR, HU, IE, IT, LU, MC, NL, PT, RO,
SE, SI, SK, TR), OAPI patent (BF, BJ, CF, CG, CI, CM,
GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG).

Published: 

— with international search report 

For two-letter codes and other abbreviations, refer to the "Guidance
Notes on Codes and Abbreviations" appearing at the beginning
of each regular issue of the PCT Gazette.

(54) Title: GENE FAMILIES ASSOCIATED WITH LIVER CANCER

(57) Abstract: The invention relates generally to the changes in gene expression in hepatocellular carcinoma. The invention relates
specifically to human genes that correspond to mRNA species that are differentially expressed in cancerous liver tissue and in cancerous
neoplasms compared to non-cancerous liver tissue.






GENE FAMILIES ASSOCIATED WITH LIVER CANCER 

TECHNICAL FIELD

The invention relates generally to the changes in gene expression in liver tissue
from cancer patients who concurrently suffer from cirrhosis or hepatitis. The invention
specifically relates to human gene families which are differentially expressed in hepatic
carcinoma tissue, compared to inflamed or cirrhotic liver tissue, and in other malignant
neoplasms.

BACKGROUND ART 

Liver Disease 

Generally, liver disease is classified as a disorder that causes the liver to
malfunction or cease functioning altogether. Cirrhosis, for example, is a group of
chronic liver diseases in which liver cells are damaged and then replaced with scar tissue,
thereby decreasing the amount of normal liver tissue. While it is most often caused by
alcohol abuse, patients with hepatitis infections and other biliary diseases can also
develop cirrhosis. Chronic hepatitis-B infection, hepatitis-C infection, and cirrhosis
have all been shown to have strong associations with primary liver cancer, although the
mechanisms involved are still not fully understood (Wu et al., (2001) Oncogene
20:3674-3682). About 10-20% of chronic hepatitis-B infections result in primary liver
cancer. Other factors such as alcohol consumption, poor nutrition and aflatoxins are
also linked to the development of primary liver cancer and cirrhosis.

Cirrhosis of the liver is characterized by widespread nodules combined with
fibrosis. Damaged or dead liver cells are replaced by fibrous scar tissue, which leads
to fibrosis. Liver cells regenerate in an abnormal pattern, producing nodules
surrounded by fibrous tissue. The fibrosis and nodule formation cause distortion and
blockage of the liver's structural components, causing impaired blood flow and
biochemical function.

In patients displaying overt symptoms, diagnosis of cirrhosis is usually easy, but
cirrhosis may be difficult to detect in its early stages. Subtle changes occurring in the
early stages include red palms, red spots on the upper body that blanch, hypertrophy of
the parotid glands, fibrosis of the tendons in the palms and gynecomastia. X-rays and
radioactive tracer tests may be effective, but diagnosis must often be by liver biopsy.

In contrast to the underlying pathology of cirrhosis, in primary liver cancer, liver
cells become abnormal, grow out of control and form malignant tumors. This disease is
also called hepatocellular carcinoma (HCC) or malignant hepatoma. Cancer that
spreads to the liver from another part of the body as a result of metastasis is not the same
disease. HCC is difficult to detect at an early stage because the symptoms are not
specific. They include loss of appetite and weight, fever, fatigue and weakness. As
the cancer progresses, pain may develop in the upper abdomen, extending to the back
and right shoulder. Swelling or a palpable mass may also be present in the upper
abdomen, along with jaundice and darkened urine. When the cancer metastasizes, it
typically targets the lungs and brain.

Diagnosis of HCC may be made by blood tests, in particular, tests for tumor
markers such as alpha-fetoprotein. About 50-70% of HCC patients show elevated
levels of alpha-fetoprotein. Additional diagnostic methods include non-radioactive
imaging (abdominal or chest x-rays, angiograms, CT scans and MRIs), liver scans using
radioactive materials and liver biopsies. Treatment of HCC is often not successful,
because detection is often too late, but methods include surgical removal of the cancer,
chemotherapy and radiation, alone or in combination. Although HCC is not very
common in the United States, it is very prevalent in parts of Asia and Africa, largely due
to the higher incidence of infection with hepatitis viruses (http://cis.nci.nih.gov/;
http://cancer.med.upenn.edu/disease/liver/intro_liver.html).

Molecular Changes in Liver Disease 

Little is known about the molecular changes in liver cells associated with the
development and progression of liver disease. Accordingly, there exists a need for the
investigation of the changes in gene expression levels as well as the need for the
identification of new molecular markers associated with the development and progression
of liver disease. Furthermore, if intervention is expected to be successful in halting or
slowing down liver disease, means of accurately assessing the early manifestations of
cirrhosis or HCC need to be established. Likewise, the development of therapeutics to
prevent or stop the progression of liver disease relies on the identification of genes
responsible for the cancerous transformation of liver cells and the growth of cancerous
liver cells or the induction of tissue damage and scar formation associated with cirrhosis.

DISCLOSURE OF THE INVENTION 

The present invention is based on the discovery of new gene families,
designated LBFL302 and LBFL303, that are differentially expressed in hepatocellular
carcinoma (HCC), compared to liver cirrhosis (LC) or chronic hepatitis (CH), and in
other malignant neoplasms. The invention includes an isolated nucleic acid molecule
selected from the group consisting of an isolated nucleic acid molecule comprising SEQ
ID NO: 1, 3, 5 or 7, an isolated nucleic acid molecule encoding SEQ ID NO: 6 or 8, an 
isolated nucleic acid molecule that encodes a protein that is expressed in liver cancer and 
that exhibits at least about 95% nucleotide sequence identity over the entire contiguous 
sequence of SEQ ID NO: 5, an isolated nucleic acid molecule that encodes a protein that 
is expressed in liver cancer and that exhibits at least about 75% nucleotide sequence 
identity over the entire contiguous sequence of SEQ ID NO: 7, and an isolated nucleic 
acid molecule comprising the complement of any of the aforementioned nucleic acid 
molecules. 

The present invention further includes the nucleic acid molecules operably
linked to one or more expression control elements, including vectors comprising the
isolated nucleic acid molecules. The invention further includes host cells transformed
to contain the nucleic acid molecules of the invention and methods for producing a
protein comprising the step of culturing a host cell transformed with a nucleic acid
molecule of the invention under conditions in which the protein is expressed.

The invention further provides an isolated polypeptide selected from the group
consisting of an isolated polypeptide comprising the amino acid sequence of SEQ ID
NO: 2, 4, 6 or 8, an isolated polypeptide comprising a fragment of at least 10 amino 
acids of SEQ ID NO: 2 or 4, an isolated polypeptide comprising conservative amino acid 
substitutions of SEQ ID NO: 2 or 4 and an isolated polypeptide comprising naturally 
occurring amino acid sequence variants of SEQ ID NO: 2 or 4. Polypeptides of the 
invention also include polypeptides with an amino acid sequence having at least about 
50%, 60%, 70% or 75% amino acid sequence identity with the sequence set forth in SEQ 
ID NO: 2 or 4, preferably at least about 80%, more preferably at least about 90-95%, and 
most preferably at least about 95-98% sequence identity with the sequence set forth in 
SEQ ID NO: 2 or 4, and a protein having at least about 95% amino acid sequence 
identity with SEQ ID NO: 6 or 8. 

The invention further provides an isolated antibody or antigen-binding antibody 
fragment that specifically binds to a polypeptide of the invention, including monoclonal 
and polyclonal antibodies. 

The invention further provides methods of identifying an agent which modulates
the expression of a nucleic acid molecule encoding a protein of the invention, 
comprising: exposing cells which express the nucleic acid molecule to the agent; and 
determining whether the agent modulates expression of said nucleic acid molecule, 
thereby identifying an agent which modulates the expression of a nucleic acid molecule 
encoding the protein. 

The invention further provides methods of identifying an agent which modulates 
the level of or at least one activity of a protein of the invention, comprising: exposing 
cells which express the protein to the agent; and determining whether the agent 
modulates the level of or at least one activity of said protein, thereby identifying an agent 
which modulates the level of or at least one activity of the protein. 

The invention further provides methods of identifying binding partners for a
protein of the invention, comprising the steps of exposing said protein to a potential
binding partner; and determining if the potential binding partner binds to said protein,
thereby identifying binding partners for the protein.

The present invention further provides methods of modulating the expression of
a nucleic acid molecule encoding a protein of the invention, comprising the step of
administering an effective amount of an agent which modulates the expression of a
nucleic acid molecule encoding the protein. The invention also provides methods of
modulating at least one activity of a protein of the invention, comprising the step of
administering an effective amount of an agent which modulates at least one activity of
the protein of the invention.

The present invention further includes non-human transgenic animals modified
to contain the nucleic acid molecules of the invention, or non-human transgenic animals
modified to contain the mutated nucleic acid molecules such that expression of the
encoded polypeptides of the invention is prevented.

The present invention also includes non-human transgenic animals in which all
or a portion of a gene comprising all or a portion of SEQ ID NO: 1, 3, 5, 7 or 9 has been
knocked out or deleted from the genome of the animal.

The invention further provides methods of diagnosing liver cancer and other
cancers, comprising the steps of acquiring a tissue, blood, urine or other sample from a
subject and determining the level of expression of a nucleic acid molecule of the
invention or polypeptide of the invention.

The invention further includes compositions comprising a diluent and a
polypeptide or protein selected from the group consisting of an isolated polypeptide
comprising the amino acid sequence of SEQ ID NO: 2, 4, 6, 8 or 10, an isolated
polypeptide comprising a fragment of at least 10 amino acids of SEQ ID NO: 2, 4, 6, 8 or
10, an isolated polypeptide comprising conservative amino acid substitutions of SEQ ID
NO: 2 or 4, naturally occurring amino acid sequence variants of SEQ ID NO: 2 or 4, an
isolated polypeptide with an amino acid sequence having at least about 50%, 60%, 70%
or 75% amino acid sequence identity with the sequence set forth in SEQ ID NO: 2 or 4,
preferably at least about 80%, more preferably at least about 90-95%, and most
preferably at least about 95-98% sequence identity with the sequence set forth in SEQ ID
NO: 2 or 4, and a polypeptide having at least about 95% amino acid sequence identity
with SEQ ID NO: 6, 8 or 10.



BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a hydrophobicity plot of the protein encoded by the open
reading frame of LBFL302, variant BC4 (SEQ ID NO: 2). Analysis was performed
according to the methods of Kyte-Doolittle and Goldman et al.

Figure 2 is a hydrophobicity plot of the protein encoded by the open
reading frame of LBFL302, variant BC7 (SEQ ID NO: 4). Analysis was performed
according to the methods of Kyte-Doolittle and Goldman et al.

Figure 3 is a hydrophobicity plot of the protein encoded by the open
reading frame of LBFL303, clone GE6 (SEQ ID NO: 6). Analysis was performed
according to the methods of Kyte-Doolittle and Goldman et al.

Figure 4 is a hydrophobicity plot of the protein encoded by the open
reading frame of LBFL303, clone MBS (SEQ ID NO: 8). Analysis was performed
according to the methods of Kyte-Doolittle and Goldman et al.

Figure 5 is a hydrophobicity plot of the protein encoded by the open
reading frame of LBFL303, clone IE4 (SEQ ID NO: 10). Analysis was performed
according to the methods of Kyte-Doolittle and Goldman et al.
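
As an illustration of how a hydrophobicity plot of the kind shown in Figures 1-5 may be computed, the following is a minimal sketch of a sliding-window average over the published Kyte-Doolittle hydropathy scale. The function name, the window length of 9 residues and the input handling are assumptions made for this example; the plotting parameters actually used for the figures are not stated here, and the Goldman et al. scale is not covered.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Return the mean hydropathy of every full window along seq.

    The window length (9) is an assumed default for illustration only.
    """
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    scores = [KD[aa] for aa in seq]  # raises KeyError on non-standard residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]
```

Positive values in the returned profile indicate hydrophobic stretches and negative values indicate hydrophilic stretches; plotting the values against window position gives a profile of the general kind described for the figures.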

BEST MODE FOR CARRYING OUT THE INVENTION 

I. General Description

The present invention is based in part on the identification of new gene families
(LBFL302 and LBFL303) that are differentially expressed in cancerous human liver
tissue, compared to inflamed or cirrhotic human liver tissue, and in other malignant
neoplasms. These gene families correspond to the human cDNA of SEQ ID NOS: 1
and 3 (LBFL302), and SEQ ID NOS: 5, 7 and 9 (LBFL303).

The genes and proteins of the invention may be used as diagnostic agents or
markers to detect liver cancer or to differentiate hepatic carcinoma from cirrhotic liver
tissue in a sample. They can also serve as a target for agents that modulate gene
expression or activity. For example, agents may be identified that modulate biological
processes associated with tumor growth, including the hyperplastic process of liver
cancer.

II. Specific Embodiments

A. The Proteins Associated with Liver Cancer 

The present invention provides isolated proteins, allelic variants of the proteins, 
and conservative amino acid substitutions of the proteins. As used herein, the "protein" 
or "polypeptide" refers, in part, to a protein that has the human amino acid sequence 
depicted in SEQ ID NO: 2, 4, 6, 8 or 10. The terms also refer to naturally occurring 
allelic variants and proteins that have a slightly different amino acid sequence than that 
specifically recited above. Allelic variants, though possessing a slightly different amino 
acid sequence than those recited above, will still have the same or similar biological 
functions associated with these proteins. 

As used herein, the family of proteins related to the human amino acid sequence 
of SEQ ID NO: 2, 4, 6, 8 or 10 refers to proteins that have been isolated from organisms 
in addition to humans. The methods used to identify and isolate other members of the 
family of proteins related to these proteins are described below.

The proteins of the present invention are preferably in isolated form. As used 
herein, a protein is said to be isolated when physical, mechanical or chemical methods 
are employed to remove the protein from cellular constituents that are normally 
associated with the protein. A skilled artisan can readily employ standard purification 
methods to obtain an isolated protein. 

The proteins of the present invention further include insertion, deletion or 
conservative amino acid substitution variants of SEQ ID NO: 2, 4, 6, 8 or 10. As used 
herein, a conservative variant refers to alterations in the amino acid sequence that do not 
adversely affect the biological functions of the protein. A substitution, insertion or
deletion is said to adversely affect the protein when the altered sequence prevents or
disrupts a biological function associated with the protein. For example, the overall
charge, structure or hydrophobic/hydrophilic properties of the protein, in certain
instances, may be altered without adversely affecting a biological activity. Accordingly, 
the amino acid sequence can be altered, for example to render the peptide more 
hydrophobic or hydrophilic, without adversely affecting the biological activities of the 
protein. 

Ordinarily, the allelic variants, the conservative substitution variants, and the
members of the protein family encoded by the LBFL302 gene will have an amino acid
sequence having at least about 50%, 60%, 70% or 75% amino acid sequence identity
with the sequence set forth in SEQ ID NO: 2 or 4, more preferably at least about 80%,
even more preferably at least about 90-95%, and most preferably at least about 99 or
99.5% sequence identity. Further, those of the protein family encoded by the LBFL303
gene will have an amino acid sequence having at least about 50%, 60%, 70% or 75%
amino acid sequence identity with the sequence set forth in SEQ ID NO: 6, 8 or 10, more
preferably at least about 80-90%, even more preferably at least about 91-94%, and most
preferably at least about 95% or 98% sequence identity. Identity or homology with
respect to such sequences is defined herein as the percentage of amino acid residues in 
the candidate sequence that are identical with SEQ ID NO: 2, 4, 6, 8 or 10, after aligning 
the sequences and introducing gaps, if necessary, to achieve the maximum percent 
homology, and not considering any conservative substitutions as part of the sequence 
identity (see section B for the relevant parameters). Fusion proteins, or N-terminal, 
C-terminal or internal extensions, deletions, or insertions into the peptide sequence shall 
not be construed as affecting homology. 
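
By way of illustration only, the identity definition above can be sketched as a calculation over a pair of already-aligned sequences. The function name, the use of "-" as the gap character, and the choice of denominator (the number of residues of the candidate sequence present in the alignment) are assumptions made for this example; the alignment itself is taken as given and is not computed here.

```python
def percent_identity(candidate_aln, reference_aln, gap="-"):
    """Percent of candidate residues identical to the aligned reference residue.

    Both arguments are rows of one pairwise alignment (equal length, gaps
    already introduced). Conservative substitutions are NOT counted as
    identical, in line with the definition in the text.
    """
    if len(candidate_aln) != len(reference_aln):
        raise ValueError("aligned sequences must have equal length")
    residues = 0
    identical = 0
    for c, r in zip(candidate_aln.upper(), reference_aln.upper()):
        if c == gap:
            continue  # no candidate residue at this column
        residues += 1
        if c == r:
            identical += 1
    if residues == 0:
        raise ValueError("candidate contains no residues")
    return 100.0 * identical / residues
```

For example, a four-residue candidate that differs from the reference only by an isoleucine-to-leucine substitution scores 75% under this sketch, because the conservative substitution is counted as a mismatch.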

Thus, the proteins of the present invention include molecules having the amino
acid sequence disclosed in SEQ ID NO: 2, 4, 6, 8 or 10; fragments thereof having a
consecutive sequence of at least about 3, 4, 5, 6, 10, 15, 20, 25, 30, 35 or more amino
acid residues of these proteins; amino acid sequence variants wherein one or more amino
acid residues has been inserted N- or C-terminal to, or within, the disclosed coding
sequence; and amino acid sequence variants of the disclosed sequence, or their fragments
as defined above, that have been substituted by at least one residue. Such fragments,
also referred to as peptides or polypeptides, may contain antigenic regions, functional 
regions of the protein identified as regions of the amino acid sequence which correspond 
to known protein domains, as well as regions of pronounced hydrophilicity. The regions 
are all easily identifiable by using commonly available protein sequence analysis 
software such as MacVector (Oxford Molecular). 

Contemplated variants further include those containing predetermined mutations
by, e.g., homologous recombination, site-directed or PCR mutagenesis, and the
corresponding proteins of other animal species, including but not limited to rabbit, 
mouse, rat, porcine, bovine, ovine, equine and non-human primate species, and the 
alleles or other naturally occurring variants of the family of proteins; and derivatives 
wherein the protein has been covalently modified by substitution, chemical, enzymatic, 
or other appropriate means with a moiety other than a naturally occurring amino acid (for 
example a detectable moiety such as an enzyme or radioisotope). 

The present invention further provides compositions comprising a protein or
polypeptide of the invention and a diluent. Suitable diluents can be aqueous or
non-aqueous solvents or a combination thereof, and can comprise additional components, for
example water-soluble salts or glycerol, that contribute to the stability, solubility, activity, 
and/or storage of the protein or polypeptide. 

As described below, members of the family of proteins can be used: (1) to 
identify agents which modulate the level of or at least one activity of the protein, (2) to
identify binding partners for the protein, (3) as an antigen to raise polyclonal or 
monoclonal antibodies, (4) as a therapeutic agent or target and (5) as a diagnostic agent 
or marker of liver cancer and other hyperplastic diseases. 

B. Nucleic Acid Molecules 

The present invention further provides nucleic acid molecules that encode the
protein having SEQ ID NO: 2, 4, 6, 8 or 10 and the related proteins herein described,
preferably in isolated form. As used herein, "nucleic acid" is defined as RNA or DNA
that encodes a protein or peptide as defined above; is complementary to a nucleic acid
sequence encoding such peptides; hybridizes to the nucleic acid of SEQ ID NO: 1, 3, 5, 7
or 9 and remains stably bound to it under appropriate stringency conditions; encodes a
polypeptide sharing at least about 50%, 60%, 70% or 75%, preferably at least about 80%,
more preferably at least about 85%, and most preferably at least about 90%, 95%, 98%,
99%, 99.5% or more identity with the peptide sequence of SEQ ID NO: 2 or 4, or a
polypeptide sharing at least about 50%, 60%, 70% or 75%, preferably at least about
80-90%, more preferably at least about 91-92%, and most preferably at least about 93%,
95%, 98%, 99% or more identity with the peptide sequence of SEQ ID NO: 6 or 8; or
exhibits at least 50%, 60%, 70% or 75%, preferably at least about 80%, more preferably
at least about 85%, and even more preferably at least about 90%, 95%, 98%, 99%, 99.5%
or more nucleotide sequence identity over the open reading frames of SEQ ID NO: 1 or 3,
or at least 50%, 60%, 70% or 75%, preferably at least about 80-90%, more preferably at
least about 91-92%, and even more preferably at least about 93%, 95%, 98%, 99% or
more nucleotide sequence identity over the open reading frames of SEQ ID NO: 5, 7 or 9.

The present invention further includes isolated nucleic acid molecules that
specifically hybridize to the complement of SEQ ID NO: 1, 3, 5, 7 or 9, particularly
molecules that specifically hybridize over the open reading frames. Such molecules
that specifically hybridize to the complement of SEQ ID NO: 1, 3, 5, 7 or 9 typically do
so under stringent hybridization conditions.
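
For readers less familiar with the terminology, the complement of a DNA strand, read 5' to 3', can be derived as in the minimal sketch below. The function name is an assumption made for this example, and only the four unambiguous DNA bases are handled; any other character is passed through unchanged.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the complementary strand of dna, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]
```

A molecule that hybridizes to the complement of a given sequence therefore has, over the hybridizing region, a sequence equal or closely similar to the given sequence itself.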

Specifically contemplated are genomic DNA, cDNA, mRNA and antisense
molecules, as well as nucleic acids based on alternative backbones or including
alternative bases, whether derived from natural sources or synthesized. Such
hybridizing or complementary nucleic acids, however, are defined further as being novel
and unobvious over any prior art nucleic acid including that which encodes, hybridizes
under appropriate stringency conditions, or is complementary to nucleic acid encoding a
protein according to the present invention.

Homology or identity at the nucleotide or amino acid sequence level is
determined by BLAST (Basic Local Alignment Search Tool) analysis using the
algorithm employed by the programs blastp, blastn, blastx, tblastn and tblastx
(Altschul et al., (1997) Nucleic Acids Res 25:3389-3402, and Karlin et al., (1990) Proc
Natl Acad Sci USA 87:2264-2268, both fully incorporated by reference) which are
tailored for sequence similarity searching. The approach used by the BLAST program is
to first consider similar segments, with and without gaps, between a query sequence and
a database sequence, then to evaluate the statistical significance of all matches that are
identified and finally to summarize only those matches which satisfy a preselected
threshold of significance. For a discussion of basic issues in similarity searching of
sequence databases, see Altschul et al., (1994) Nature Genetics 6:119-129 which is fully
incorporated by reference. The search parameters for histogram, descriptions,
alignments, expect (i.e., the statistical significance threshold for reporting matches
against database sequences), cutoff, matrix and filter (low complexity) are at the default
settings. The default scoring matrix used by blastp, blastx, tblastn, and tblastx is the
BLOSUM62 matrix (Henikoff et al., (1992) Proc Natl Acad Sci USA 89:10915-10919,
fully incorporated by reference), recommended for query sequences over 85 nucleotides
or amino acids in length.

For blastn, the scoring matrix is set by the ratios of M (i.e., the reward score for
a pair of matching residues) to N (i.e., the penalty score for mismatching residues),
wherein the default values for M and N are 5 and -4, respectively. Four blastn
parameters were adjusted as follows: Q=10 (gap creation penalty); R=10 (gap extension
penalty); wink=1 (generates word hits at every wink-th position along the query); and
gapw=16 (sets the window width within which gapped alignments are generated). The
equivalent Blastp parameter settings were Q=9; R=2; wink=1; and gapw=32. A Bestfit
comparison between sequences, available in the GCG package version 10.0, uses DNA
parameters GAP=50 (gap creation penalty) and LEN=3 (gap extension penalty) and the
equivalent settings in protein comparisons are GAP=8 and LEN=2.
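
To make the roles of M, N, Q and R concrete, the following sketch scores a pre-computed nucleotide alignment with the blastn values quoted above. It is not the BLAST program: the function name is hypothetical, and the gap convention (the first gapped column costs Q, each further consecutive gapped column costs R) is an assumption made for this example that may differ from the convention of any particular BLAST implementation.

```python
def alignment_score(query_aln, subject_aln, match=5, mismatch=-4,
                    gap_open=10, gap_extend=10, gap="-"):
    """Score two rows of a nucleotide alignment.

    match / mismatch correspond to M and N in the text; gap_open and
    gap_extend correspond to Q and R. Assumed gap convention: a run of k
    gapped columns costs gap_open + (k - 1) * gap_extend. Consecutive
    gapped columns are treated as one run regardless of which row is gapped.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    in_gap = False
    for q, s in zip(query_aln.upper(), subject_aln.upper()):
        if q == gap or s == gap:
            score -= gap_extend if in_gap else gap_open
            in_gap = True
        else:
            score += match if q == s else mismatch
            in_gap = False
    return score
```

Under these assumptions, four matching bases score 20, and replacing one of them with a mismatch lowers the score to 11.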

"Stringent conditions" are those that (1) employ low ionic strength and high
temperature for washing, for example, 0.015 M NaCl/0.0015 M sodium citrate/0.1%
SDS at 50 °C, or (2) employ during hybridization a denaturing agent such as formamide,
for example, 50% (vol/vol) formamide with 0.1% bovine serum albumin/0.1%
Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750
mM NaCl, 75 mM sodium citrate at 42 °C. Another example is hybridization in 50%
formamide, 5x SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate
(pH 6.8), 0.1% sodium pyrophosphate, 5x Denhardt's solution, sonicated salmon sperm
DNA (50 µg/ml), 0.1% SDS, and 10% dextran sulfate at 42 °C, with washes at 42 °C in
0.2x SSC and 0.1% SDS. A skilled artisan can readily determine and vary the
stringency conditions appropriately to obtain a clear and detectable hybridization signal.
Preferred molecules are those that hybridize under the above conditions to the
complement of SEQ ID NO: 1, 3, 5 or 7 and which encode a functional or full-length
protein. Even more preferred hybridizing molecules are those that hybridize under the
above conditions to the complement strand of the open reading frame of SEQ ID NO: 1,
3, 5, 7 or 9.
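
The SSC concentrations quoted above follow from simple scaling of the 1x buffer (0.15 M NaCl, 0.015 M sodium citrate, consistent with the 5x values given in the text). The following sketch, with a hypothetical function name, shows the arithmetic.

```python
# Molar composition of 1x SSC, consistent with the 5x values in the text.
SSC_1X_MOLAR = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_concentrations(fold):
    """Return the molar concentration of each SSC component at the given fold."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {name: fold * molar for name, molar in SSC_1X_MOLAR.items()}
```

On this basis the 0.2x SSC wash corresponds to about 0.03 M NaCl and 0.003 M sodium citrate, and the low ionic strength wash of example (1) (0.015 M NaCl/0.0015 M sodium citrate) corresponds to 0.1x SSC.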

As used herein, a nucleic acid molecule is said to be "isolated" when the nucleic 
acid molecule is substantially separated from contaminant nucleic acid molecules 
encoding other polypeptides. 

The present invention further provides fragments of the disclosed nucleic acid
molecules. As used herein, a fragment of a nucleic acid molecule refers to a small
portion of the coding or non-coding sequence. The size of the fragment will be
determined by the intended use. For example, if the fragment is chosen so as to encode
an active portion of the protein, the fragment will need to be large enough to encode the
functional region(s) of the protein. For instance, fragments which encode peptides
corresponding to predicted antigenic regions may be prepared. If the fragment is to be
used as a nucleic acid probe or PCR primer, then the fragment length is chosen so as to
obtain a relatively small number of false positives during probing/priming (see the
discussion in Section H).

Fragments of the nucleic acid molecules of the present invention (i.e., synthetic
oligonucleotides) that are used as probes or specific primers for the polymerase chain
reaction (PCR), or to synthesize gene sequences encoding proteins of the invention, can
easily be synthesized by chemical techniques, for example, the phosphoramidite method
of Matteucci et al., ((1981) J Am Chem Soc 103:3185-3191) or using automated
synthesis methods. In addition, larger DNA segments can readily be prepared by well
known methods, such as synthesis of a group of oligonucleotides that define various
modular segments of the gene, followed by ligation of oligonucleotides to build the
complete modified gene.

The nucleic acid molecules of the present invention may further be modified so 
as to contain a detectable label for diagnostic and probe purposes. A variety of such 
labels are known in the art and can readily be employed with the encoding molecules 
herein described. Suitable labels include, but are not limited to, biotin, radiolabeled or 
fluorescently labeled nucleotides and the like. A skilled artisan can readily employ any 
such label to obtain labeled variants of the nucleic acid molecules of the invention. 

C. Isolation of Other Related Nucleic Acid Molecules 

As described above, the identification and characterization of the nucleic acid 
molecule having SEQ ID NO: 1, 3, 5, 7 or 9 allows a skilled artisan to isolate nucleic 
acid molecules that encode other members of the protein family in addition to the 
sequences herein described. Further, the presently disclosed nucleic acid molecules 
allow a skilled artisan to isolate nucleic acid molecules that encode other members of the 
family of proteins in addition to the proteins having SEQ ID NO: 2, 4, 6, 8 or 10. 

For instance, a skilled artisan can readily use the amino acid sequence of SEQ 
ID NO: 2, 4, 6, 8 or 10 to generate antibody probes to screen expression libraries 
prepared from appropriate cells. Typically, polyclonal antiserum from mammals such 
as rabbits immunized with the purified protein (as described below) or monoclonal 
antibodies can be used to probe a mammalian cDNA or genomic expression library, such 
as a lambda gt11 library, to obtain the appropriate coding sequence for other members of 
the protein family. The cloned cDNA sequence can be expressed as a fusion protein, 
expressed directly using its own control sequences, or expressed by constructions using 
control sequences appropriate to the particular host used for expression of the enzyme. 

Alternatively, a portion of the coding sequence herein described can be 
synthesized and used as a probe to retrieve DNA encoding a member of the protein 
family from any mammalian organism. Oligomers containing approximately 18-20 
nucleotides (encoding about a 6-7 amino acid stretch) are prepared and used to screen 
genomic DNA or cDNA libraries to obtain hybridization under stringent conditions or 
conditions of sufficient stringency to eliminate an undue level of false positives. 
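
The arithmetic behind the oligomer length can be sketched as follows. This is a minimal illustration, assuming the standard genetic code; the function names and the example peptide are hypothetical and do not come from the sequences of the invention. It shows that a 6 amino acid stretch corresponds to an 18 nucleotide oligomer, and how many distinct oligomers a fully degenerate pool for that stretch would contain.

```python
# Number of codons encoding each amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def probe_length(peptide: str) -> int:
    """Length in nucleotides of an oligomer encoding the peptide (3 per codon)."""
    return 3 * len(peptide)


def probe_degeneracy(peptide: str) -> int:
    """Number of distinct oligomers that can encode the peptide."""
    degeneracy = 1
    for residue in peptide.upper():
        degeneracy *= CODON_COUNTS[residue]
    return degeneracy


if __name__ == "__main__":
    example = "MWKDEF"  # hypothetical 6-residue stretch, chosen for low degeneracy
    print(probe_length(example), probe_degeneracy(example))
```

Stretches rich in Met and Trp give the least degenerate pools, which is one reason such regions are often favored when a probe is designed from protein sequence.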

Additionally, pairs of oligonucleotide primers can be prepared for use in a 
polymerase chain reaction (PCR) to selectively clone an encoding nucleic acid molecule. 
A PCR denature/anneal/extend cycle for using such PCR primers is well known in the art 
and can readily be adapted for use in isolating other encoding nucleic acid molecules. 
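
The text does not state how the annealing step of such a cycle is set. One widely used rule of thumb for short primers, offered here only as an assumption and not as the method of the invention, is the Wallace rule, which estimates the melting temperature as 2 °C per A or T plus 4 °C per G or C. The sketch below applies it to a primer pair; the primer sequences and the 5 °C offset are hypothetical illustration values.

```python
def wallace_tm(primer: str) -> int:
    """Estimate melting temperature (deg C) of a short primer by the Wallace rule."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer must contain only A, C, G and T")
    return 2 * at + 4 * gc


def suggested_annealing(forward: str, reverse: str, offset: int = 5) -> int:
    """Annealing temperature set a few degrees below the lower primer Tm.

    The offset of 5 deg C is a common starting point, not a fixed rule.
    """
    return min(wallace_tm(forward), wallace_tm(reverse)) - offset


if __name__ == "__main__":
    fwd = "ATGCATGCATGCATGCAT"  # hypothetical 18-mer
    rev = "GGCCATTAGGCCATTAGG"  # hypothetical 18-mer
    print(wallace_tm(fwd), wallace_tm(rev), suggested_annealing(fwd, rev))
```

The estimate is only meaningful for oligonucleotides of roughly primer length; longer probes call for other formulas and empirical optimization.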

Nucleic acid molecules encoding other members of the protein family may also 
be identified in existing genomic or other sequence information using any available 
computational method, including but not limited to: PSI-BLAST (Altschul et al., (1997) 
Nucleic Acids Res 25:3389-3402); PHI-BLAST (Zhang et al., (1998) Nucleic Acids Res 
26:3986-3990); 3D-PSSM (Kelly et al., (2000) J Mol Biol 299(2):499-520); and other 
computational analysis methods (Shi et al., (1999) Biochem Biophys Res Commun 
262(1):132-138 and Matsunami et al., (2000) Nature 404(6778):601-604). 

D. rDNA Molecules Containing a Nucleic Acid Molecule 

The present invention further provides recombinant DNA molecules (rDNAs) 
that contain a coding sequence. As used herein, a rDNA molecule is a DNA molecule 
that has been subjected to molecular manipulation in situ. Methods for generating 
rDNA molecules are well known in the art, for example, see Sambrook et al., Molecular 
Cloning - A Laboratory Manual, Third Ed., Cold Spring Harbor Laboratory Press, Cold 
Spring Harbor, NY, 2001. In the preferred rDNA molecules, a coding DNA sequence is 
operably linked to expression control sequences and/or vector sequences. 

The choice of vector and/or expression control sequences to which one of the 
protein family encoding sequences of the present invention is operably linked depends 
directly, as is well known in the art, on the functional properties desired, e.g., protein 
expression, and the host cell to be transformed. A vector contemplated by the present 
invention is at least capable of directing the replication or insertion into the host 
chromosome, and preferably also expression, of the structural gene included in the rDNA 
molecule. 

Expression control elements that are used for regulating the expression of an 
operably linked protein encoding sequence are known in the art and include, but are not 
limited to, inducible promoters, constitutive promoters, secretion signals, and other 
regulatory elements. Preferably, the inducible promoter is readily controlled, such as 
being responsive to a nutrient in the host cell's medium. 

In one embodiment, the vector containing a coding nucleic acid molecule will 
include a prokaryotic replicon, i.e., a DNA sequence having the ability to direct 
autonomous replication and maintenance of the recombinant DNA molecule 
extrachromosomally in a prokaryotic host cell, such as a bacterial host cell, transformed 
therewith. Such replicons are well known in the art. In addition, vectors that include 
a prokaryotic replicon may also include a gene whose expression confers a detectable 
marker such as a drug resistance. Typical bacterial drug resistance genes are those that 
confer resistance to ampicillin, kanamycin, chloramphenicol or tetracycline. 

Vectors that include a prokaryotic replicon can further include a prokaryotic or 
bacteriophage promoter capable of directing the expression (transcription and 
translation) of the coding gene sequences in a bacterial host cell, such as E. coli. A 
promoter is an expression control element formed by a DNA sequence that permits 
binding of RNA polymerase and transcription to occur. Promoter sequences compatible 
with bacterial hosts are typically provided in plasmid vectors containing convenient 
restriction sites for insertion of a DNA segment of the present invention. Typical of 
such vector plasmids are pUC8, pUC9, pBR322 and pBR329 available from BioRad 
Laboratories (Richmond, CA), pPL and pKK223 available from Pharmacia (Piscataway, 
NJ). 

Expression vectors compatible with eukaryotic cells, preferably those 
compatible with vertebrate cells, such as liver cells, can also be used to form rDNA 
molecules that contain a coding sequence. Eukaryotic cell expression vectors, 
including viral vectors, are well known in the art and are available from several 
commercial sources. Typically, such vectors are provided containing convenient 
restriction sites for insertion of the desired DNA segment. Typical of such vectors are 
pSVL and pKSV-10 (Pharmacia), pBPV-1/pML2d (International Biotechnologies, Inc.), 
pTDT1 (ATCC, #31255), the vector pCDM8 described herein, and the like eukaryotic 
expression vectors. Vectors may be modified to include liver cell specific promoters if 
needed. 

Eukaryotic cell expression vectors used to construct the rDNA molecules of the 
present invention may further include a selectable marker that is effective in a 
eukaryotic cell, preferably a drug resistance selection marker. A preferred drug 
resistance marker is the gene whose expression results in neomycin resistance, i.e., the 
neomycin phosphotransferase (neo) gene (Southern et al., (1982) J Mol Appl Genet 
1:327-341). Alternatively, the selectable marker can be present on a separate plasmid, 
and the two vectors are introduced by co-transfection of the host cell, and selected by 
culturing in the appropriate drug for the selectable marker. 

E. Host Cells Containing an Exogenously Supplied Coding Nucleic Acid Molecule 

The present invention further provides host cells transformed with a nucleic acid 
molecule that encodes a protein of the present invention. The host cell can be either 
prokaryotic or eukaryotic. Eukaryotic cells useful for expression of a protein of the 
invention are not limited, so long as the cell line is compatible with cell culture methods 
and compatible with the propagation of the expression vector and expression of the gene 
product. Preferred eukaryotic host cells include, but are not limited to, yeast, insect and 
mammalian cells, preferably vertebrate cells such as those from a mouse, rat, monkey or 
human cell line. Preferred eukaryotic host cells include Chinese hamster ovary (CHO) 
cells available from the ATCC as CCL61, NIH Swiss mouse embryo cells (NIH/3T3) 
available from the ATCC as CRL 1658, baby hamster kidney cells (BHK), and the like 
eukaryotic tissue culture cell lines. 

Any prokaryotic host can be used to express a rDNA molecule encoding a 
protein of the invention. The preferred prokaryotic host is E. coli. 

Transformation of appropriate cell hosts with a rDNA molecule of the present 
invention is accomplished by well known methods that typically depend on the type of 
vector used and host system employed. With regard to transformation of prokaryotic 
host cells, electroporation and salt treatment methods are typically employed (see, for 
example, Cohen et al., (1972) Proc Natl Acad Sci USA 69:2110; and Sambrook et al., 
supra). With regard to transformation of vertebrate cells with vectors containing 
rDNAs, electroporation, cationic lipid or salt treatment methods are typically employed, 
see, for example, Graham et al., (1973) Virol 52:456; Wigler et al., (1979) Proc Natl 
Acad Sci USA 76:1373-1376. 

Successfully transformed cells, cells that contain a rDNA molecule of the 
present invention, can be identified by well known techniques including the selection for 
a selectable marker. For example, cells resulting from the introduction of an rDNA of 
the present invention can be cloned to produce single colonies. Cells from those 
colonies can be harvested, lysed and their DNA content examined for the presence of the 
rDNA using a method such as that described by Southern, (1975) J Mol Biol 98:503 or 
Berent et al., (1985) Biotech 3:208, or the proteins produced from the cell assayed via an 
immunological method. 

F. Production of Recombinant Proteins Using a rDNA Molecule 

The present invention further provides methods for producing a protein of the 
invention using nucleic acid molecules herein described. In general terms, the 
production of a recombinant form of a protein typically involves the following steps: 

First, a nucleic acid molecule is obtained that encodes a protein of the invention, 
such as a nucleic acid molecule comprising, consisting essentially of or consisting of 
SEQ ID NO: 1 or SEQ ID NO: 3; nucleotides 155-421 or 155-418 of SEQ ID NO: 1; 
nucleotides 139-405 or 139-402 of SEQ ID NO: 3; SEQ ID NO: 5, SEQ ID NO: 7 or 
SEQ ID NO: 9; nucleotides 32-1387 or 32-1384 of SEQ ID NO: 5; nucleotides 41-1504 
or 41-1501 of SEQ ID NO: 7; or nucleotides 31-1554 or 31-1551 of SEQ ID NO: 9. If 
the encoding sequence is uninterrupted by introns, as are these open-reading-frames, it is 
directly suitable for expression in any host. 
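
The coordinate pairs listed above can be checked with a short script. The sketch below assumes the ranges are 1-based and inclusive, which is the usual convention for sequence listings; the helper names and the toy sequence are hypothetical. It confirms that every listed range is a whole number of codons and that each pair differs by exactly three nucleotides, which is consistent with the longer range including a terminal stop codon (an inference, since the text does not say so explicitly).

```python
# (SEQ ID NO) -> (longer range, shorter range), as listed in the text.
ORF_RANGES = {
    1: ((155, 421), (155, 418)),
    3: ((139, 405), (139, 402)),
    5: ((32, 1387), (32, 1384)),
    7: ((41, 1504), (41, 1501)),
    9: ((31, 1554), (31, 1551)),
}


def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def extract(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based, inclusive range out of a sequence string."""
    return sequence[start - 1:end]


if __name__ == "__main__":
    for seq_id, (longer, shorter) in ORF_RANGES.items():
        n_long = span_length(*longer)
        n_short = span_length(*shorter)
        print(f"SEQ ID NO: {seq_id}: {n_long} nt ({n_long // 3} codons), "
              f"{n_short} nt ({n_short // 3} codons)")
    print(extract("ACGTACGT", 2, 4))  # toy sequence, not from the invention
```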

The nucleic acid molecule is then preferably placed in operable linkage with 
suitable control sequences, as described above, to form an expression unit containing the 
protein open reading frame. The expression unit is used to transform a suitable host 
and the transformed host is cultured under conditions that allow the production of the 
recombinant protein. Optionally the recombinant protein is isolated from the medium 
or from the cells; recovery and purification of the protein may not be necessary in some 
instances where some impurities may be tolerated. 

Each of the foregoing steps can be done in a variety of ways. For example, the 
desired coding sequences may be obtained from genomic fragments and used directly in 
appropriate hosts. The construction of expression vectors that are operable in a variety 
of hosts is accomplished using appropriate replicons and control sequences, as set forth 
above. The control sequences, expression vectors, and transformation methods are 
dependent on the type of host cell used to express the gene and were discussed in detail 
earlier. Suitable restriction sites can, if not normally available, be added to the ends of 
the coding sequence so as to provide an excisable gene to insert into these vectors. A 
skilled artisan can readily adapt any host/expression system known in the art for use with 
the nucleic acid molecules of the invention to produce recombinant protein. 

G. Methods to Identify Binding Partners 

Another embodiment of the present invention provides methods for isolating 
and identifying binding partners of proteins of the invention. In general, a protein of 
the invention is mixed with a potential binding partner or an extract or fraction of a cell 
under conditions that allow the association of potential binding partners with the protein 
of the invention. After mixing, peptides, polypeptides, proteins or other molecules that 
have become associated with a protein of the invention are separated from the mixture. 
The binding partner that bound to the protein of the invention can then be removed and 
further analyzed. To identify and isolate a binding partner, the entire protein, for 
instance a protein comprising the entire amino acid sequence of SEQ ID NO: 2, 4, 6, 8 or 
10 can be used. Alternatively, a fragment of the protein can be used. 

As used herein, a cellular extract refers to a preparation or fraction which is 
made from a lysed or disrupted cell. The preferred source of cellular extracts will be 
cells derived from human liver tumors or transformed liver cells, for instance, biopsy 
tissue or tissue culture cells from hepatic carcinomas. Alternatively, cellular extracts 
may be prepared from normal tissue or available cell lines, particularly liver-derived cell 
lines. 

A variety of methods can be used to obtain an extract of a cell. Cells can be 
disrupted using either physical or chemical disruption methods. Examples of physical 
disruption methods include, but are not limited to, sonication and mechanical shearing. 
Examples of chemical lysis methods include, but are not limited to, detergent lysis and 
enzyme lysis. A skilled artisan can readily adapt methods for preparing cellular 
extracts in order to obtain extracts for use in the present methods. 

Once an extract of a cell is prepared, the extract is mixed with the protein of the 
invention under conditions in which association of the protein with the binding partner 
can occur. A variety of conditions can be used, the most preferred being conditions that 
closely resemble conditions found in the cytoplasm of a human cell. Features such as 
osmolarity, pH, temperature, and the concentration of cellular extract used, can be varied 
to optimize the association of the protein with the binding partner. 

After mixing under appropriate conditions, the bound complex is separated from 
the mixture. A variety of techniques can be utilized to separate the mixture. For 
example, antibodies specific to a protein of the invention can be used to 
immunoprecipitate the binding partner complex. Alternatively, standard chemical 
separation techniques such as chromatography and density/sediment centrifugation can 
be used. 

After removal of non-associated cellular constituents found in the extract, the 
binding partner can be dissociated from the complex using conventional methods. For 
example, dissociation can be accomplished by altering the salt concentration or pH of the 
mixture. 

To aid in separating associated binding partner pairs from the mixed extract, the 
protein of the invention can be immobilized on a solid support. For example, the 
protein can be attached to a nitrocellulose matrix or acrylic beads. Attachment of the 
protein to a solid support aids in separating peptide/binding partner pairs from other 
constituents found in the extract. The identified binding partners can be either a single 
protein or a complex made up of two or more proteins. Alternatively, binding partners 
may be identified using a Far-Western assay according to the procedures of Takayama et 
al., (1997) Methods Mol Biol 69:171-184 or Sauder et al., (1996) J Gen Virol 77:991-996 
or identified through the use of epitope tagged proteins or GST fusion proteins. 



Alternatively, the nucleic acid molecules of the invention can be used in a yeast 
two-hybrid system or other in vivo protein-protein detection system. The yeast two- 
hybrid system has been used to identify other protein partner pairs and can readily be 
adapted to employ the nucleic acid molecules herein described. 

H. Methods to Identify Agents that Modulate the Expression of a Nucleic Acid 
Encoding the Genes Associated with Liver Cancer 

Another embodiment of the present invention provides methods for identifying 
agents that modulate the expression of a nucleic acid encoding a protein of the invention 
such as a protein having the amino acid sequence of SEQ ID NO: 2, 4, 6, 8 or 10. Such 
assays may utilize any available means of monitoring for changes in the expression level 
of the nucleic acids of the invention. As used herein, an agent is said to modulate the 
expression of a nucleic acid of the invention if it is capable of up- or down-regulating 
expression of the nucleic acid in a cell. 

In one assay format, cell lines that contain reporter gene fusions between 
nucleotides from within the open reading frame defined by nucleotides 155-421 of SEQ 
ID NO: 1, nucleotides 139-405 of SEQ ID NO: 3, nucleotides 32-1387 of SEQ ID NO: 5, 
nucleotides 41-1504 of SEQ ID NO: 7, or nucleotides 31-1554 of SEQ ID NO: 9 and/or 
the 5' and/or 3' regulatory elements and any assayable fusion partner may be prepared. 
Numerous assayable fusion partners are known and readily available including the firefly 
luciferase gene and the gene encoding chloramphenicol acetyltransferase (Alam et al., 
(1990) Anal Biochem 188:245-254). Cell lines containing the reporter gene fusions are 
then exposed to the agent to be tested under appropriate conditions and time. 
Differential expression of the reporter gene between samples exposed to the agent and 
control samples identifies agents which modulate the expression of a nucleic acid of the 
invention. 
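
The comparison of reporter signal between exposed and control samples can be sketched as a simple fold-change calculation. This is a minimal illustration only: the readings, the function names and the two-fold threshold are all hypothetical, and the text does not prescribe any particular statistic or cutoff.

```python
from statistics import mean


def fold_change(treated: list[float], control: list[float]) -> float:
    """Ratio of mean reporter signal in agent-exposed wells to control wells."""
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control signal must be non-zero")
    return mean(treated) / control_mean


def classify(fc: float, threshold: float = 2.0) -> str:
    """Label an agent by fold change; the threshold is an arbitrary example."""
    if fc >= threshold:
        return "up-regulated"
    if fc <= 1 / threshold:
        return "down-regulated"
    return "no change"


if __name__ == "__main__":
    # Hypothetical luciferase readings (arbitrary light units).
    treated = [300, 330, 270]
    control = [100, 110, 90]
    fc = fold_change(treated, control)
    print(fc, classify(fc))
```

In practice replicate variability would also be assessed, for example with a significance test, before an agent is called a modulator.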

Additional assay formats may be used to monitor the ability of the agent to 
modulate the expression of a nucleic acid encoding a protein of the invention, such as the 
protein having SEQ ID NO: 2, 4, 6, 8 or 10. For instance, mRNA expression may be 
monitored directly by hybridization to the nucleic acids of the invention. Cell lines are 
exposed to the agent to be tested under appropriate conditions and time and total RNA or 
mRNA is isolated by standard procedures such as those disclosed in Sambrook et al., 
Molecular Cloning - A Laboratory Manual, Third Ed., Cold Spring Harbor Laboratory 
Press, Cold Spring Harbor, NY, 2001. 

The preferred cells will be those derived from human liver tissue, for instance, 
liver biopsy tissue or cultured cells from patients with liver cancer, or liver cancer and 
cirrhosis, or liver cancer and hepatitis. Cell lines such as ATCC hepatocellular 
carcinoma cell lines Catalogue Nos. HB-8064, HB-8065 or CRL-10741 may be used. 
Alternatively, other available cells or cell lines may be used. 

Probes to detect differences in RNA expression levels between cells exposed to 
the agent and control cells may be prepared from the nucleic acids of the invention. It 
is preferable, but not necessary, to design probes which hybridize only with target 
nucleic acids under conditions of high stringency. Only highly complementary nucleic 
acid hybrids form under conditions of high stringency. Accordingly, the stringency of 
the assay conditions determines the amount of complementarity which should exist 
between two nucleic acid strands in order to form a hybrid. Stringency should be chosen 
to maximize the difference in stability between the probe:target hybrid and 
probe:non-target hybrids. 

Probes may be designed from the nucleic acids of the invention through 
methods known in the art. For instance, the G+C content of the probe and the probe 
length can affect probe binding to its target sequence. Methods to optimize probe 
specificity are commonly available in Sambrook et al., supra, or Ausubel et al., Short 
Protocols in Molecular Biology, Fourth Ed., John Wiley & Sons, Inc., New York, 1999. 
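
The two probe properties named above, G+C content and length, are easy to compute when screening candidate probes. The following sketch is an illustration under stated assumptions: the function names, the window length and the G+C bounds are hypothetical, and the toy sequence is not taken from the sequences of the invention.

```python
def gc_fraction(probe: str) -> float:
    """Fraction of G and C bases in a probe sequence."""
    seq = probe.upper()
    if not seq:
        raise ValueError("probe must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_probes(sequence: str, length: int,
                     gc_min: float, gc_max: float) -> list[tuple[int, str]]:
    """Return (1-based start, window) for every window within the G+C bounds."""
    hits = []
    for i in range(len(sequence) - length + 1):
        window = sequence[i:i + length]
        if gc_min <= gc_fraction(window) <= gc_max:
            hits.append((i + 1, window))
    return hits


if __name__ == "__main__":
    toy = "AAAAGGCC"  # toy sequence for illustration only
    print(gc_fraction("GGCCAATT"))
    print(candidate_probes(toy, 4, 0.5, 1.0))
```

A real design would add further filters, such as checks for self-complementarity and for similarity to non-target sequences.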

Hybridization conditions are modified using known methods, such as those 
described by Sambrook et al. and Ausubel et al., as required for each probe. 
Hybridization of total cellular RNA or RNA enriched for polyA RNA can be 
accomplished in any available format. For instance, total cellular RNA or RNA 
enriched for polyA RNA can be affixed to a solid support and the solid support exposed 
to at least one probe comprising at least one, or part of one of the sequences of the 
invention under conditions in which the probe will specifically hybridize. Alternatively, 
nucleic acid fragments comprising at least one, or part of one of the sequences of the 
invention can be affixed to a solid support, such as a silicon chip, porous glass wafer or 
membrane. The solid support can then be exposed to total cellular RNA or polyA RNA 
from a sample under conditions in which the affixed sequences will specifically 
hybridize. Such solid supports and hybridization methods are widely available, for 
example, those disclosed by Beattie, (1995) WO 95/11755. By examining for the 
ability of a given probe to specifically hybridize to an RNA sample from an untreated 
cell population and from a cell population exposed to the agent, agents which up- or 
down-regulate the expression of a nucleic acid encoding the protein having the sequence 
of SEQ ID NO: 2, 4, 6, 8 or 10 are identified. 

Hybridization for qualitative and quantitative analysis of mRNAs may also be 
carried out by using an RNase Protection Assay (i.e., RPA, see Ma et al., (1996) Methods 
10:273-278). Briefly, an expression vehicle comprising cDNA encoding the gene 
product and a phage specific DNA dependent RNA polymerase promoter (e.g., T7, T3 
or SP6 RNA polymerase) is linearized at the 3' end of the cDNA molecule, downstream 
from the phage promoter, wherein such a linearized molecule is subsequently used as a 
template for synthesis of a labeled antisense transcript of the cDNA by in vitro 
transcription. The labeled transcript is then hybridized to a mixture of isolated RNA (i.e., 
total or fractionated mRNA) by incubation at 45 °C overnight in a buffer comprising 
80% formamide, 40 mM Pipes, pH 6.4, 0.4 M NaCl and 1 mM EDTA. The resulting 
hybrids are then digested in a buffer comprising 40 µg/ml ribonuclease A and 2 µg/ml 
ribonuclease. After deactivation and extraction of extraneous proteins, the samples are 
loaded onto urea/polyacrylamide gels for analysis. 

In another assay, to identify agents which affect the expression of the instant 
gene products, cells or cell lines are first identified which express the gene products of 
the invention physiologically. Cell and/or cell lines so identified would be expected to 
comprise the necessary cellular machinery such that the fidelity of modulation of the 
transcriptional apparatus is maintained with regard to exogenous contact of agent with 
appropriate surface transduction mechanisms and/or the cytosolic cascades. Further, 
such cells or cell lines would be transduced or transfected with an expression vehicle 
(e.g., a plasmid or viral vector) construct comprising an operable non-translated 
5' promoter-containing end of the structural gene encoding the instant gene products 
fused to one or more antigenic fragments, which are peculiar to the instant gene products, 
wherein said fragments are under the transcriptional control of said promoter and are 
expressed as polypeptides whose molecular weight can be distinguished from the 
naturally occurring polypeptides or may further comprise an immunologically distinct 
tag or other detectable marker. Such a process is well known in the art (see Sambrook 
et al., supra). 

Cells or cell lines transduced or transfected as outlined above are then contacted 
with agents under appropriate conditions. For example, the agent in a pharmaceutically 
acceptable excipient is contacted with cells in an aqueous physiological buffer such as 
phosphate buffered saline (PBS) at physiological pH, Eagles balanced salt solution (BSS) 
at physiological pH, PBS or BSS comprising serum or conditioned media comprising 
PBS or BSS and/or serum incubated at 37 °C. Said conditions may be modulated as 
deemed necessary by one of skill in the art. Subsequent to contacting the cells with the 
agent, said cells will be disrupted and the polypeptides of the lysate are fractionated such 
that a polypeptide fraction is pooled and contacted with an antibody to be further 
processed by immunological assay (e.g., ELISA, immunoprecipitation or Western blot). 
The pool of proteins isolated from the "agent-contacted" sample will be compared with a 
control sample where only the excipient is contacted with the cells and an increase or 
decrease in the immunologically generated signal from the "agent-contacted" sample 
compared to the control will be used to distinguish the effectiveness of the agent. 

I. Methods to Identify Agents that Modulate the Level or at Least One Activity of 
the Liver Cancer Associated Proteins 

Another embodiment of the present invention provides methods for identifying 
agents that modulate the level or at least one activity of a protein of the invention such as 
the protein having the amino acid sequence of SEQ ID NO: 2, 4, 6, 8 or 10. Such 
methods or assays may utilize any means of monitoring or detecting the desired activity. 

In one format, the relative amounts of a protein of the invention between a cell 
population that has been exposed to the agent to be tested compared to an un-exposed 
control cell population may be assayed. In this format, probes such as specific 
antibodies are used to monitor the differential expression of the protein in the different 
cell populations. Cell lines or populations are exposed to the agent to be tested under 
appropriate conditions and time. Cellular lysates may be prepared from the exposed 
cell line or population and a control, unexposed cell line or population. The cellular 
lysates are then analyzed with the probe. 

Antibody probes are prepared by immunizing suitable mammalian hosts in 
appropriate immunization protocols using the peptides, polypeptides or proteins of the 
invention if they are of sufficient length, or, if desired, or if required to enhance 
immunogenicity, conjugated to suitable carriers. Methods for preparing immunogenic 
conjugates with carriers such as BSA, KLH, or other carrier proteins are well known in 
the art. In some circumstances, direct conjugation using, for example, carbodiimide 
reagents may be effective; in other instances linking reagents such as those supplied by 
Pierce Chemical Co. (Rockford, IL), may be desirable to provide accessibility to the 
hapten. The hapten peptides can be extended at either the amino or carboxy terminus 
with a cysteine residue or interspersed with cysteine residues, for example, to facilitate 
linking to a carrier. Administration of the immunogens is conducted generally by 
injection over a suitable time period and with use of suitable adjuvants, as is generally 
understood in the art. During the immunization schedule, titers of antibodies are taken 
to determine adequacy of antibody formation. 

While the polyclonal antisera produced in this way may be satisfactory for some 
applications, for pharmaceutical compositions, use of monoclonal preparations is 
preferred. Immortalized cell lines which secrete the desired monoclonal antibodies may 
be prepared using the standard method of Kohler and Milstein ((1975) Nature 
256:495-497) or modifications which effect immortalization of lymphocytes or spleen 
cells, as is generally known. The immortalized cell lines secreting the desired 
antibodies are screened by immunoassay in which the antigen is the peptide hapten, 
polypeptide or protein. When the appropriate immortalized cell culture secreting the 
desired antibody is identified, the cells can be cultured either in vitro or by production in 
ascites fluid. 

The desired monoclonal antibodies are then recovered from the culture 
supernatant or from the ascites supernatant. Fragments of the monoclonal antibodies or 
the polyclonal antisera which contain the immunologically significant (antigen-binding) 
portion can be used as antagonists, as well as the intact antibodies. Use of 
immunologically reactive (antigen-binding) antibody fragments, such as the Fab, Fab', or 
F(ab')2 fragments is often preferable, especially in a therapeutic context, as these 
fragments are generally less immunogenic than the whole immunoglobulin. 

The antibodies or antigen-binding fragments may also be produced, using 
current technology, by recombinant means. Antibody regions that bind specifically to 
the desired regions of the protein can also be produced in the context of chimeras with 
multiple species origin, such as humanized antibodies. 

Agents that are assayed in the above method can be randomly selected or 
rationally selected or designed. As used herein, an agent is said to be randomly 
selected when the agent is chosen randomly without considering the specific sequences 
involved in the association of a protein of the invention alone or with its associated 
substrates, binding partners, etc. An example of randomly selected agents is the use of a 
chemical library or a peptide combinatorial library, or a growth broth of an organism. 

As used herein, an agent is said to be rationally selected or designed when the 
agent is chosen on a nonrandom basis which takes into account the sequence of the target 
site and/or its conformation in connection with the agent's action. Agents can be 
rationally selected or rationally designed by utilizing the peptide sequences that make up 
these sites. For example, a rationally selected peptide agent can be a peptide whose 
amino acid sequence is identical to or a derivative of any functional consensus site. 

The agents of the present invention can be, as examples, peptides, small 
molecules, vitamin derivatives, as well as carbohydrates. Dominant negative proteins, 
DNAs encoding these proteins, antibodies to these proteins, peptide fragments of these 
proteins or mimics of these proteins may be introduced into cells to affect function. 
"Mimic" used herein refers to the modification of a region or several regions of a peptide 
molecule to provide a structure chemically different from the parent peptide but 
topographically and functionally similar to the parent peptide (see Grant in: Molecular 
Biology and Biotechnology, Meyers, ed., pp. 659-664, VCH Publishers, Inc., New York, 
1995). A skilled artisan can readily recognize that there is no limit as to the structural 
nature of the agents of the present invention. 

The peptide agents of the invention can be prepared using standard solid phase 
(or solution phase) peptide synthesis methods, as is known in the art. In addition, the 
DNA encoding these peptides may be synthesized using commercially available 
oligonucleotide synthesis instrumentation and produced recombinantly using standard 
recombinant production systems. The production using solid phase peptide synthesis is 
necessitated if non-gene-encoded amino acids are to be included. 

Another class of agents of the present invention are antibodies immunoreactive 
with critical positions of proteins of the invention. Antibody agents are obtained by 
immunization of suitable mammalian subjects with peptides, containing as antigenic 
regions, those portions of the protein intended to be targeted by the antibodies. 

J. Uses for Agents that Modulate the Expression or at Least one Activity of the 
Proteins Associated with Liver Cancer 

As provided in the Examples, the proteins and nucleic acids of the invention, 
such as the proteins having the amino acid sequence of SEQ ID NO: 2, 4, 6, 8 or 10, are 
differentially expressed in cancerous liver tissue. Agents that up- or down-regulate or 
modulate the expression of the protein or at least one activity of the protein, such as 
agonists or antagonists, may be used to modulate biological and pathologic processes 
associated with the protein's function and activity. 

As used herein, a subject can be any mammal, so long as the mammal is in need
of modulation of a pathological or biological process mediated by a protein of the
invention. The term "mammal" is defined as an individual belonging to the class 
Mammalia. The invention is particularly useful in the treatment of human subjects. 

Pathological processes refer to a category of biological processes which produce
a deleterious effect. For example, expression of a protein of the invention may be
associated with liver cell growth or hyperplasia. As used herein, an agent is said to
modulate a pathological process when the agent reduces the degree or severity of the
process. For instance, liver cancer may be prevented or disease progression modulated
by the administration of agents which up- or down-regulate or modulate in some way the
expression or at least one activity of a protein of the invention.

The agents of the present invention can be provided alone, or in combination 
with other agents that modulate a particular pathological process. For example, an 
agent of the present invention can be administered in combination with other known 
drugs. As used herein, two agents are said to be administered in combination when the 
two agents are administered simultaneously or are administered independently in a 
fashion such that the agents will act at the same time.

The agents of the present invention can be administered via parenteral, 
subcutaneous, intravenous, intramuscular, intraperitoneal, transdermal, or buccal routes. 
Alternatively, or concurrently, administration may be by the oral route. The dosage
administered will be dependent upon the age, health, and weight of the recipient, kind of 
concurrent treatment, if any, frequency of treatment, and the nature of the effect desired. 

The present invention further provides compositions containing one or more 
agents which modulate expression or at least one activity of a protein of the invention. 
While individual needs vary, determination of optimal ranges of effective amounts of 
each component is within the skill of the art. Typical dosages comprise 0.1 to 100
µg/kg body wt. The preferred dosages comprise 0.1 to 10 µg/kg body wt. The most
preferred dosages comprise 0.1 to 1 µg/kg body wt.

In addition to the pharmacologically active agent, the compositions of the 
present invention may contain suitable pharmaceutically acceptable carriers comprising 
excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically for delivery to the site of action. 
Suitable formulations for parenteral administration include aqueous solutions of the 
active compounds in water-soluble form, for example, water-soluble salts. In addition, 
suspensions of the active compounds as appropriate oily injection suspensions may be
administered. Suitable lipophilic solvents or vehicles include fatty oils, for example,
sesame oil, or synthetic fatty acid esters, for example, ethyl oleate or triglycerides.
Aqueous injection suspensions may contain substances which increase the viscosity of
the suspension, including, for example, sodium carboxymethyl cellulose, sorbitol, and/or
dextran. Optionally, the suspension may also contain stabilizers. Liposomes can also 
be used to encapsulate the agent for delivery into the cell. 

The pharmaceutical formulation for systemic administration according to the
invention may be formulated for enteral, parenteral or topical administration. Indeed, 
all three types of formulations may be used simultaneously to achieve systemic 
administration of the active ingredient. 

Suitable formulations for oral administration include hard or soft gelatin 
capsules, pills, tablets, including coated tablets, elixirs, suspensions, syrups or
inhalations and controlled release forms thereof. 

In practicing the methods of this invention, the compounds of this invention 
may be used alone or in combination, or in combination with other therapeutic or 
diagnostic agents. In certain preferred embodiments, the compounds of this invention 
may be coadministered along with other compounds typically prescribed for these 
conditions according to generally accepted medical practice. The compounds of this 
invention can be utilized in vivo, ordinarily in mammals, such as humans, sheep, horses,
cattle, pigs, dogs, cats, rats and mice, or in vitro. 

K. Transgenic Animals 

Transgenic animals containing mutant, knock-out or modified genes 
corresponding to the cDNA sequence of SEQ ID NO: 1, 3, 5, 7 or 9, or the open reading 
frame encoding the polypeptide sequence of SEQ ID NO: 2, 4, 6, 8 or 10 or fragments
thereof having a consecutive sequence of at least about 3, 4, 5, 6, 10, 15, 20, 25, 30, 35 
or more amino acid residues, are also included in the invention. Transgenic animals are 
genetically modified animals into which recombinant, exogenous or cloned genetic 
material has been experimentally transferred. Such genetic material is often referred to 
as a "transgene." The nucleic acid sequence of the transgene, in this case a form of 
SEQ ID NO: 1, 3, 5, 7 or 9, may be integrated either at a locus of a genome where that 
particular nucleic acid sequence is not otherwise normally found or at the normal locus
for the transgene. The transgene may consist of nucleic acid sequences derived from the
genome of the same species or of a different species than the species of the target animal. 

In some embodiments, transgenic animals in which all or a portion of a gene
comprising SEQ ID NO: 1, 3, 5, 7 or 9 is deleted may be constructed. In those cases
where the gene corresponding to SEQ ID NO: 1, 3, 5, 7 or 9 contains one or more introns,
the entire gene (all exons, introns and the regulatory sequences) may be deleted.
Alternatively, less than the entire gene may be deleted. For example, a single exon 
and/or intron may be deleted, so as to create an animal expressing a modified version of 
a protein of the invention. 

The term "germ cell line transgenic animal" refers to a transgenic animal in
which the genetic alteration or genetic information was introduced into a germ line cell,
thereby conferring the ability of the transgenic animal to transfer the genetic information
to offspring. If such offspring in fact possess some or all of that alteration or genetic
information, then they too are transgenic animals. 

The alteration or genetic information may be foreign to the species of animal to
which the recipient belongs, foreign only to the particular individual recipient, or may be
genetic information already possessed by the recipient. In the last case, the altered or 
introduced gene may be expressed differently than the native gene. 

Transgenic animals can be produced by a variety of different methods including
transfection, electroporation, microinjection, gene targeting in embryonic stem cells and
recombinant viral and retroviral infection (see, e.g., U.S. Patent No. 4,736,866; U.S.
Patent No. 5,602,307; Mullins et al., (1993) Hypertension 22:630-633; Brenin et al.,
(1997) Surg Oncol 6:99-110; Recombinant Gene Expression Protocols (Methods in
Molecular Biology, Vol. 62), Tuan, ed., Humana Press, Totowa, NJ, 1997).

A number of recombinant or transgenic mice have been produced, including
those which express an activated oncogene sequence (U.S. Patent No. 4,736,866);
express simian SV40 T-antigen (U.S. Patent No. 5,728,915); lack the expression of
interferon regulatory factor 1 (IRF-1) (U.S. Patent No. 5,731,490); exhibit dopaminergic
dysfunction (U.S. Patent No. 5,723,719); express at least one human gene which
participates in blood pressure control (U.S. Patent No. 5,731,489); display greater
similarity to the conditions existing in naturally occurring Alzheimer's disease (U.S.
Patent No. 5,720,936); have a reduced capacity to mediate cellular adhesion (U.S. Patent
No. 5,602,307); possess a bovine growth hormone gene (Clutter et al., (1996) Genetics
143:1753-1760); or are capable of generating a fully human antibody response
(McCarthy (1997) Lancet 349:405).

While mice and rats remain the animals of choice for most transgenic
experimentation, in some instances it is preferable or even necessary to use alternative
animal species. Transgenic procedures have been successfully utilized in a variety of
non-murine animals, including sheep, goats, pigs, dogs, cats, monkeys, chimpanzees,
hamsters, rabbits, cows and guinea pigs (see, e.g., Kim et al., (1997) Mol Reprod Dev
46:515-526; Houdebine, (1995) Reprod Nutr Dev 35:609-617; Petters (1994) Reprod
Fertil Dev 6:643-645; Schnieke et al., (1997) Science 278:2130-2133; and Amoah,
(1997) J Animal Science 75:578-585).

The method of introduction of nucleic acid fragments into recombination
competent mammalian cells can be by any method which favors co-transformation of
multiple nucleic acid molecules. Detailed procedures for producing transgenic animals
are readily available to one skilled in the art, including the disclosures in U.S. Patent No.
5,489,743 and U.S. Patent No. 5,602,307.

L. Diagnostic Methods

As the genes and proteins of the invention are differentially expressed in 
cancerous liver tissue (HCC) and in other carcinomas, compared to non-cancerous 
tissues, the genes and proteins of the invention may be used to diagnose or monitor liver 
cancer or other malignant neoplasms, to track disease progression, or to differentiate
HCC tissue from cirrhotic tissue samples. One means of diagnosing liver cancer using
the nucleic acid molecules or proteins of the invention involves obtaining tissue from
living subjects, including liver tissue.

The use of molecular biological tools has become routine in forensic technology. 
For example, nucleic acid probes comprising all or at least part of the sequence of SEQ
ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7 or SEQ ID NO: 9 may be used
to determine the expression of a nucleic acid molecule in forensic/pathology specimens.
Further, nucleic acid assays may be carried out by any means of conducting a
transcriptional profiling analysis. In addition to nucleic acid analysis, forensic methods
of the invention may target the proteins of the invention, particularly a protein
comprising SEQ ID NO: 2, 4, 6, 8 or 10, to determine up- or down-regulation of the
genes (Shiverick et al., (1975) Biochim Biophys Acta 393:124-133).

Methods of the invention may involve treatment of tissues with collagenases or
other proteases to make the tissue amenable to cell lysis (Semenov et al., (1987) Biull
Eksp Biol Med 104:113-116). Further, it is possible to obtain biopsy samples from
different regions of the liver for analysis.

Assays to detect nucleic acid or protein molecules of the invention may be in 
any available format. Typical assays for nucleic acid molecules include hybridization 
or PCR-based formats. Typical assays for the detection of proteins, polypeptides or
peptides of the invention include the use of antibody probes in any available format such
as in situ binding assays, etc. (see Harlow & Lane, Antibodies - A Laboratory Manual,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1988). In preferred
embodiments, assays are carried out with appropriate controls.

The above methods may also be used in other diagnostic protocols, including 
protocols and methods to detect disease states in other tissues or organs, for example in
tissues in which expression of a nucleic acid molecule of the invention is detected.

Without further description, it is believed that one of ordinary skill in the art can, 
using the preceding description and the following illustrative examples, make and utilize
the compounds of the present invention and practice the claimed methods. The
following working examples, therefore, specifically point out preferred embodiments of
the present invention, and are not to be construed as limiting in any way the remainder of
the disclosure.

EXAMPLES

Example 1a

Identification of Differentially Expressed LBFL302 mRNA in Hepatocellular Carcinoma

The patient tissue samples were derived from 10 Korean patients and classified
into two groups of 5 patients each. One group consisted of patients who had been
diagnosed with chronic viral hepatitis B (CH) and who later developed hepatic
carcinomas (HCC). The patients in this group, three men and two women, ranged in
age from 34 to 63. The second group of patients had been diagnosed with cirrhosis of the
liver (LC). These patients also later developed hepatic carcinomas (HCC). In this
group of four men and one woman, the patients ranged in age from 40 to 62. For each
patient, tissue was obtained from two areas of the liver to produce a set of biopsy
samples. In the first patient group (cancer/hepatitis), samples were removed from liver
tumors and from the non-cancerous surrounding area composed of inflamed tissue
(inflammation due to hepatitis). In the second group (cancer/cirrhosis), liver tissue was
removed from tumors and from the non-cancerous surrounding area composed of fibrotic
tissue (areas of fibrosis due to cirrhosis). Histological analysis of each of the tissue
samples was performed and samples were segregated into either non-cancerous or
cancerous categories.

With minor modifications, the sample preparation protocol followed the
Affymetrix GeneChip Expression Analysis Manual. Frozen tissue was first ground to
powder using the Spex Certiprep 6800 Freezer Mill. Total RNA was then extracted
using Trizol (Life Technologies). The total RNA yield for each sample (average tissue
weight of 300 mg) was 200-500 µg. Next, mRNA was isolated using the Oligotex
mRNA Midi kit (Qiagen). Since the mRNA was eluted in a final volume of 400 µl, an
ethanol precipitation step was required to bring the concentration to 1 µg/µl. Using 1-5
µg of mRNA, double stranded cDNA was created using the Superscript Choice system
(Gibco-BRL). First strand cDNA synthesis was primed with a T7-(dT24)
oligonucleotide. The cDNA was then phenol-chloroform extracted and ethanol
precipitated to a final concentration of 1 µg/µl.

From 2 µg of cDNA, cRNA was synthesized according to standard procedures.
To biotin label the cRNA, nucleotides Bio-11-CTP and Bio-16-UTP (Enzo Diagnostics)
were added to the reaction. After a 37 °C incubation for six hours, the labeled cRNA
was cleaned up according to the RNeasy Mini kit protocol (Qiagen). The cRNA was
then fragmented (5x fragmentation buffer: 200 mM Tris-Acetate (pH 8.1), 500 mM
KOAc, 150 mM MgOAc) for thirty-five minutes at 94 °C.
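
As an illustration of the buffer arithmetic only, the following Python sketch converts the 5x fragmentation buffer composition stated above into working concentrations. That the fragmentation reaction is run at 1x is an assumption not stated in this description, and the function names are hypothetical.

```python
# 5x fragmentation buffer composition as stated in the text (mM).
STOCK_5X_MM = {
    "Tris-acetate (pH 8.1)": 200,
    "KOAc": 500,
    "MgOAc": 150,
}


def working_concentrations(stock_mm, stock_factor=5, final_factor=1):
    """Scale each stock concentration from stock_factor-x down to final_factor-x.

    Assumption: the reaction is run at 1x; the text gives only the 5x recipe.
    """
    return {name: conc * final_factor / stock_factor
            for name, conc in stock_mm.items()}


def stock_volume_ul(final_volume_ul, stock_factor=5):
    """Volume of concentrated buffer needed for a 1x reaction of the given size."""
    return final_volume_ul / stock_factor


if __name__ == "__main__":
    for name, conc in working_concentrations(STOCK_5X_MM).items():
        print(f"{name}: {conc:g} mM at 1x")
    print(f"5x buffer for a 40 ul reaction: {stock_volume_ul(40):g} ul")
```

Under that assumption the 1x buffer would contain 40 mM Tris-acetate, 100 mM KOAc and 30 mM MgOAc.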

55 µg of fragmented cRNA was hybridized on the Affymetrix Human Genome
U95 and U133 set of arrays for twenty-four hours at 60 rpm in a 45 °C hybridization
oven. The chips were washed and stained with Streptavidin Phycoerythrin (SAPE)
(Molecular Probes) in Affymetrix fluidics stations. To amplify staining, SAPE solution
was added twice with an anti-streptavidin biotinylated antibody (Vector Laboratories)
staining step in between. Hybridization to the probe arrays was detected by
fluorometric scanning (Hewlett Packard Gene Array Scanner). Following hybridization
and scanning, the microarray images were analyzed for quality control, looking for major
chip defects or abnormalities in hybridization signal. After all chips passed QC, the
data was analyzed using Affymetrix Microarray Suite (v4.0), and LIMS (v1.5) for U95
or Affymetrix Microarray Suite (v5.0), and LIMS (v3.0) for U133.

Differential expression of genes between the cancerous and non-cancerous liver
samples was determined by using Affymetrix human GeneChip sets, U95 and U133, with
the following statistical methods. (1) For each gene, Affymetrix GeneChip average
difference values for U95 were determined by Affymetrix Microarray Suite (v4.0), which
also made "Absent" (=not detected), "Present" (=detected) or "Marginal" (=not clearly
Absent or Present) calls for each GeneChip element. Signal values for U133 were
determined by Affymetrix Microarray Suite (v5.0), which also made Absent, Present or
Marginal calls. (2) Using the criteria of at least 10% present call in both cancerous and
non-cancerous liver samples and at least 40% present call in either cancerous or
non-cancerous liver sample groups, a gene set was selected for further analysis. (3) Based
on the average difference values of U95 data, the gene set was split into two groups, a
high expression group and low expression group. The high expression group contained
genes with average difference values greater than or equal to 5 in both cancerous and
non-cancerous samples. The remainder of the genes were included in the low
expression group. The average difference values were transformed to a logarithmic
scale for the high expression group, but were not changed for the low expression group.
For U133 data, all signal values were transformed to a logarithmic scale regardless of
expression level. (4) The Analysis of Variance (ANOVA) method was used for data
analysis (Steel et al., Principles and Procedures of Statistics: A Biometrical Approach,
Third Ed., McGraw-Hill, 1997). Prior to the final analysis, a leave-one-out approach was
used for outlier detection. One sample at a time was left out of the ANOVA analysis to
determine whether or not omitting a specific sample from the analysis had any
significant effect on the final result. If so, that particular sample was excluded from the
final analysis. After outlier detection, a list of genes that are differentially expressed
with a p-value of less than or equal to 0.05 was generated by ANOVA. Data from the
Affymetrix GeneChip U133 chip set was analyzed with a similar procedure. (5) Two
additional criteria were used to reduce the number of genes in the gene list generated
from U95. Firstly, geometric mean values were compared between the non-cancerous
control group samples and the carcinoma disease group samples to obtain a set of genes
showing at least 2.0-fold increases or decreases in expression level. Secondly, the ratio
of the fold-change value and the p-value had to be 400 or greater.
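
The selection criteria of steps (2) and (5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis software actually used: the call codes "P", "A" and "M", the reading of the 10%/40% rule as "at least 10% in each group and at least 40% in at least one group", and all function names are hypothetical. The ANOVA of step (4) is not reproduced; a p-value is taken as an input.

```python
import math


def present_fraction(calls):
    """Fraction of detection calls equal to 'P' (Present)."""
    return sum(1 for c in calls if c == "P") / len(calls)


def passes_present_filter(cancer_calls, normal_calls):
    """Step (2), as read here: >=10% Present in both groups, >=40% in at least one."""
    fc = present_fraction(cancer_calls)
    fn = present_fraction(normal_calls)
    return min(fc, fn) >= 0.10 and max(fc, fn) >= 0.40


def geometric_mean(values):
    """Geometric mean of strictly positive expression values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def fold_change(cancer_values, normal_values):
    """Step (5): ratio of geometric means, returned as (magnitude >= 1, direction)."""
    ratio = geometric_mean(cancer_values) / geometric_mean(normal_values)
    return (ratio, "up") if ratio >= 1 else (1 / ratio, "down")


def passes_final_criteria(fold, p_value, min_fold=2.0, min_ratio=400.0):
    """Step (5): at least 2.0-fold change and fold-change / p-value of 400 or greater."""
    return fold >= min_fold and fold / p_value >= min_ratio


if __name__ == "__main__":
    # Hypothetical values for a single probe set, for illustration only.
    fold, direction = fold_change([80, 100, 125], [20, 25, 31.25])
    print(round(fold, 3), direction, passes_final_criteria(fold, 0.005))
```

Note that the ratio criterion is stricter than the p-value cutoff of step (4) for modest changes: a gene with exactly a 2.0-fold change passes only if its p-value is 0.005 or less.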

Analysis of the chip data showed that the expression of the marker LBFL302
was significantly up-regulated (6.:^-fold, p = 0.00116 for U95; 7.87-fold, p = 0.000944
for U133) in liver carcinoma samples compared to samples from cirrhotic liver tissue.
Up-regulation (2.76-fold, p > 0.05 for U95; 4.64-fold, p = 0.0115 for U133) was also
observed in liver carcinoma samples compared to tissue samples from inflamed liver
tissue (biopsies from areas of inflammation in chronic hepatitis patients). These data
indicate that up-regulation of LBFL302 may be diagnostic for liver cancer in people with
cirrhosis and may also be diagnostic for liver cancer in people with chronic hepatitis.

The expression level of LBFL302 (SEQ ID NO: 1 or 3) can be measured by chip
sequence fragment nos. 51263_at and 226936_at on Affymetrix GeneChips® U95 and
U133, respectively. The expression levels of 51263_at and 226936_at in various
malignant neoplasms, compared to normal control tissues, are shown in Table 1a, where
the fold-change and the direction of the change (up- or down-regulation) are also
indicated. A fold-change greater than 1.5 was considered to be significant.



Table 1a

Table 2 summarizes the differential expression data collected from experiments
using Affymetrix GeneChips by tissue type. The chips were scanned and the data
analyzed by the GX Scan algorithm, which is described in related applications
60/331,182, 60/388,745 and 60/390,608, all entitled "An Automated Computer-based
Algorithm for Organizing and Mining Gene Expression Data Derived from Biological
Samples with Complex Clinical Attributes," and all of which are herein incorporated by
reference in their entirety.



Table 2

LBFL302 is up-regulated in certain types of the following malignant neoplasms with a
fold change of 1.5 and above:

                                      51263_at           226936_at
                                      from U95 data      from U133 data

 1. Bladder                           UP
 2. Breast
    (Infiltrating duct carcinoma)     UP                 UP
 3. Cervix                            UP                 UP
 4. Colon                             UP                 UP
 5. Kidney                            UP                 UP
 6. Liver                             UP                 UP
 7. Lung                              UP                 UP
 8. Myometrium                        UP
 9. Ovary                             UP                 UP
10. Pancreas                          UP                 UP
11. Rectum                            UP                 UP
12. Skin                              UP                 UP
13. Small Intestine                   UP
14. Soft tissues                      UP                 UP
15. Spleen                            UP                 UP
16. Stomach                           UP                 UP

The GeneChip expression results, determined by sample binding to chip
sequence fragment no. 51263_at, were validated by quantitative RT-PCR (Q-RT-PCR)
using the Taqman® assay (Perkin-Elmer). PCR primers designed from the sequence
information file of the specific Affymetrix fragment (51263_at) were used in the assay.
The target gene in each RNA sample (ten ng of total RNA) was assayed relative to an
exogenously spiked reference gene. For this purpose, the tetracycline resistance gene
was used as the exogenously added spike. This approach provides the relative
expression as measured by cycle threshold (Ct) value of the target mRNA relative to a
constant amount of Tet spike Ct values. The sample panel included liver cirrhosis (LC),
chronic hepatitis (CH) and hepatocellular carcinoma (HCC) tissue RNAs that were
analyzed on U95 GeneChips. In addition, several new samples that were not analyzed
on the GeneChip were used for the expression validations by Q-RT-PCR. The
Q-RT-PCR data confirm the up-regulation of LBFL302 observed in HCC, compared to LC or
CH biopsy samples.
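
One common way to turn such Ct values into a relative expression figure is the comparative Ct calculation sketched below. The description does not state which formula was applied, so the use of this calculation, the assumed amplification efficiency of 2 per cycle, and the function names are all assumptions made for illustration.

```python
def relative_expression(ct_target, ct_spike, efficiency=2.0):
    """Target abundance relative to the spiked reference in the same sample.

    Assumes both amplicons amplify with the same per-cycle efficiency
    (2.0 means perfect doubling). A lower Ct means more template.
    """
    return efficiency ** (ct_spike - ct_target)


def fold_change(ct_target_a, ct_spike_a, ct_target_b, ct_spike_b, efficiency=2.0):
    """Ratio of spike-normalized expression in sample A over sample B."""
    return (relative_expression(ct_target_a, ct_spike_a, efficiency)
            / relative_expression(ct_target_b, ct_spike_b, efficiency))


if __name__ == "__main__":
    # Hypothetical Ct values: sample A (e.g. tumor) versus sample B (e.g. cirrhotic).
    print(fold_change(24.0, 26.0, 27.0, 26.0))
```

Because the Tet spike is added in a constant amount, its Ct serves as the per-sample reference, and a target Ct three cycles lower at the same spike Ct corresponds to an eight-fold higher relative expression under these assumptions.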

Example 1b

Identification of Differentially Expressed LBFL303 mRNA in Hepatocellular Carcinoma

The patient tissue samples were derived from 19 Korean patients and classified into
two groups. One group consisted of nine patients who had been diagnosed with
chronic viral hepatitis B (CH) and who later developed hepatic carcinomas (HCC). The
patients in this group, five men and four women, ranged in age from 34 to 65. The second
group, of ten patients, had been diagnosed with cirrhosis of the liver (LC). These
patients also later developed hepatic carcinomas (HCC). In this group of eight men and
two women, the patients ranged in age from 37 to 62. The same procedures as in the
above Example 1a were then carried out for each patient.

Analysis of the data from U95 chips showed that expression of the marker
corresponding to SEQ ID NO: 5, 7 or 9 was significantly up-regulated (934-fold,
p-value = 1.44 X 10"^) in liver carcinoma samples compared to samples from cirrhotic liver
tissue. Data from U133 chips showed that expression of SEQ ID NO: 5 or 7 was also
significantly up-regulated in liver carcinoma samples compared to samples from
cirrhotic liver tissue (2.60-fold, p-value = 3.63 X 10'^) and compared to samples from
inflamed liver tissue (5.69-fold, p = 8.99 X 10"^ for U95) (biopsies from areas of
inflammation in chronic hepatitis patients). These data indicate that up-regulation of
SEQ ID NOS: 5, 7 and 9 may be diagnostic for liver cancer in people and, in particular,
patients with cirrhosis or chronic hepatitis.

The expression level of LBFL303 clones GE6, MB5 or IE4 (SEQ ID NOS: 5, 7 or
9, respectively) can be measured by chip sequence fragment nos. 46690_at (U95 chip)
and 219175_at and 224931_at (U133 chip) on Affymetrix GeneChips®. Differential
expression data were collected from experiments using Affymetrix GeneChips® by
tissue type and were analyzed by the GX Scan algorithm, which is described in related
applications 60/331,182, 60/388,745 and 60/390,608, all entitled "An Automated
Computer-based Algorithm for Organizing and Mining Gene Expression Data Derived
from Biological Samples with Complex Clinical Attributes," and all of which are herein
incorporated by reference in their entirety. The expression levels of 46690_at,
219175_at and 224931_at in various malignant neoplasms, compared to normal control
tissues, are shown in Table 1b, where the fold-change and the direction of the change (up-
or down-regulation) are also indicated. A fold-change greater than 1.5 was considered
to be significant.

Table 1b

The GeneChip expression results, determined by sample binding to chip
sequence fragment no. 46690_at, were validated by quantitative RT-PCR (Q-RT-PCR)
using the Taqman® assay (Perkin-Elmer), as in the above Example 1a.
The Q-RT-PCR data confirm the up-regulation of the genes corresponding to
LBFL303 clones GE6, MB5 and IE4 observed in HCC, compared to LC or CH biopsy
samples.

Example 2

Cloning of Full Length Human cDNAs Corresponding to Differentially Expressed mRNA
Species

The full length cDNA having SEQ ID NO: 1, 3, 5, 7 or 9 was obtained by the
oligo-pulling method. Briefly, a gene-specific oligo was designed based on the
sequence of SEQ ID NO: 1, 3, 5, 7 or 9. The oligo was labeled with biotin and used to
hybridize with 2 µg of single strand plasmid DNA (cDNA recombinants) from a fully
differentiated stomach adenocarcinoma library (NCI CGAP Gas 4) following the
procedures of Sambrook et al. The hybridized cDNAs were separated by
streptavidin-conjugated beads and eluted by heating. The eluted cDNA was converted
to double strand plasmid DNA and used to transform E. coli cells (DH10B) and the
longest cDNA was screened. After positive selection was confirmed by PCR using
gene-specific primers, the cDNA clone was subjected to DNA sequencing.

The nucleotide sequence of the full-length human cDNAs corresponding to the
differentially regulated mRNA detected above is set forth in SEQ ID NOS: 1, 3, 5, 7 and
9. The cDNA of SEQ ID NO: 1 comprises 578 base pairs (531 base pairs and a polyA
tail), and the cDNA of SEQ ID NO: 3 comprises 531 base pairs (515 base pairs and a
polyA tail). The cDNA of SEQ ID NO: 5 comprises 2067 base pairs (2040 base pairs
and a polyA tail), the cDNA of SEQ ID NO: 7 comprises 2178 base pairs (2162 base
pairs and a polyA tail), and the cDNA of SEQ ID NO: 9 comprises 1616 base pairs
(1598 base pairs and a polyA tail).

An open reading frame within the cDNA nucleotide sequence of SEQ ID NO: 1, 
at nucleotides 155-418 (155-421 including the stop codon), encodes a protein of 88 
amino acids. The amino acid sequence corresponding to a predicted protein encoded by 
SEQ ID NO: 1 is set forth in SEQ ID NO: 2. 

An open reading frame within the cDNA nucleotide sequence of SEQ ID NO: 3,
at nucleotides 139-402 (139-405 including the stop codon), also encodes a protein of 88
amino acids. The amino acid sequence corresponding to a predicted protein encoded by
SEQ ID NO: 3 is set forth in SEQ ID NO: 4. The protein sequence of SEQ ID NO: 4 is
identical to that of SEQ ID NO: 2, except for the amino acid at position 28 (arginine in
SEQ ID NO: 2, but leucine in SEQ ID NO: 4, caused by a G->T point mutation at
nucleotide position no. 237 in SEQ ID NO: 1 and 221 in SEQ ID NO: 3), although the
nucleic acid sequences encoding these proteins differ upstream of the coding region.
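
The stated coordinates of the point mutation can be checked against the open reading frames with a short calculation. The sketch below is illustrative only: it relies on the standard genetic code rather than on the sequences of SEQ ID NO: 1 or 3 themselves (which are not reproduced here), and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codon_position(nucleotide, orf_start):
    """Return (codon number, position within codon), both 1-based."""
    offset = nucleotide - orf_start
    return offset // 3 + 1, offset % 3 + 1


def arg_to_leu_by_g_to_t():
    """Arginine codons that become leucine codons through one G->T change.

    Returns tuples of (original codon, mutant codon, 1-based position changed).
    """
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != "R":
            continue
        for i, base in enumerate(codon):
            if base == "G":
                mutant = codon[:i] + "T" + codon[i + 1:]
                if CODON_TABLE[mutant] == "L":
                    hits.append((codon, mutant, i + 1))
    return hits


if __name__ == "__main__":
    print(codon_position(237, 155))  # SEQ ID NO: 1, ORF starting at nucleotide 155
    print(codon_position(221, 139))  # SEQ ID NO: 3, ORF starting at nucleotide 139
    print(arg_to_leu_by_g_to_t())
```

Both nucleotide positions fall on the second base of codon 28, which agrees with the stated amino acid position, and under the standard code a single G->T change converts arginine to leucine only at the second base of a CGN codon.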

SEQ ID NOS: 2 and 4 are weakly similar to histone-like transcription factor
(CBF/NF-Y) and the archaeal histone signature. In addition, these amino acid
sequences are weakly similar to the bacterial regulatory protein lysR family
helix-turn-helix signature. This signature contains three domains. The amino acid sequences
of SEQ ID NO: 2 and SEQ ID NO: 4 are 22% identical to the two domains at the C-terminus.

An open reading frame within the cDNA nucleotide sequence of SEQ ID NO: 5
(GE6), at nucleotides 32-1384 (32-1387 including the stop codon), encodes a protein of
451 amino acids. The amino acid sequence corresponding to the protein encoded by
SEQ ID NO: 5 is set forth in SEQ ID NO: 6.

An open reading frame within the cDNA nucleotide sequence of SEQ ID NO: 7
(MB5), at nucleotides 41-1501 (41-1504 including the stop codon), encodes a protein of
487 amino acids. The amino acid sequence corresponding to the protein encoded by
SEQ ID NO: 7 is set forth in SEQ ID NO: 8. The protein sequence of SEQ ID NO: 8 is
identical to that of SEQ ID NO: 6, except for an insertion of 36 amino acids toward the
amino terminus (see multiple sequence alignment below).

An open reading frame within the cDNA nucleotide sequence of SEQ ID NO: 9
(IE4), at nucleotides 31-1551 (31-1554 including the stop codon), encodes a protein of
507 amino acids. The amino acid sequence corresponding to the protein encoded by
SEQ ID NO: 9 is set forth in SEQ ID NO: 10. The protein corresponding to SEQ ID
NO: 10 is identical to that of SEQ ID NO: 8 for the first 456 amino acids, although the
remainders of the proteins have no homology (see multiple sequence alignment below).
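
The open reading frame coordinates and protein lengths given in this Example are mutually consistent, as the following arithmetic check illustrates. It uses only the numbers stated above; the dictionary layout and function names are hypothetical.

```python
# (first nucleotide, last nucleotide before the stop codon, stated protein length)
ORFS = {
    "SEQ ID NO: 1": (155, 418, 88),
    "SEQ ID NO: 3": (139, 402, 88),
    "SEQ ID NO: 5 (GE6)": (32, 1384, 451),
    "SEQ ID NO: 7 (MB5)": (41, 1501, 487),
    "SEQ ID NO: 9 (IE4)": (31, 1551, 507),
}


def codon_count(first, last):
    """Number of codons in the inclusive nucleotide range first..last."""
    length = last - first + 1
    if length % 3 != 0:
        raise ValueError(f"range {first}-{last} is not a whole number of codons")
    return length // 3


def check_orfs(orfs):
    """Map each ORF name to True if its coordinates match the stated length."""
    return {name: codon_count(first, last) == residues
            for name, (first, last, residues) in orfs.items()}


if __name__ == "__main__":
    for name, ok in check_orfs(ORFS).items():
        print(name, "consistent" if ok else "inconsistent")
```

Each range spans three nucleotides per encoded residue, the ranges "including the stop codon" end three nucleotides later, and the difference between the MB5 and GE6 proteins (487 - 451) equals the stated 36 amino acid insertion.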






Multiple amino acid sequence alignment for GE6, MB5 and IE4

                   1                                                   50
LBFL303_GE6.pep    MDGTETRQRR LDSCGKPGEL GLPHPLSTGG LPVASEDGAL RAPESQSVTP
LBFL303_MB5.pep    MDGTETRQRR LDSCGKPGEL GLPHPLSTGG LPVASEDGAL RAPESQSVTP
LBFL303_IE4.pep    MDGTETRQRR LDSCGKPGEL GLPHPLSTGG LPVASEDGAL RAPESQSVTP
Consensus          MDGTETRQRR LDSCGKPGEL GLPHPLSTGG LPVASEDGAL RAPESQSVTP

                   51                                                 100
LBFL303_GE6.pep    KPLETEPSRE TAWSIGLQVT VPFMFAGLGL SWAGMLLDYF Q---------
LBFL303_MB5.pep    KPLETEPSRE TAWSIGLQVT VPFMFAGLGL SWAGMLLDYF QHWPVFVEVK
LBFL303_IE4.pep    KPLETEPSRE TAWSIGLQVT VPFMFAGLGL SWAGMLLDYF QHWPVFVEVK
Consensus          KPLETEPSRE TAWSIGLQVT VPFMFAGLGL SWAGMLLDYF QHWPVFVEVK

                   101                                                150
LBFL303_GE6.pep    ---------- ---------- -------ANT GQIDDPQEQH RVISSNLALI
LBFL303_MB5.pep    DLLTLVPPLV GLKGNLEMTL ASRLSTAANT GQIDDPQEQH RVISSNLALI
LBFL303_IE4.pep    DLLTLVPPLV GLKGNLEMTL ASRLSTAANT GQIDDPQEQH RVISSNLALI
Consensus          DLLTLVPPLV GLKGNLEMTL ASRLSTAANT GQIDDPQEQH RVISSNLALI

                   151                                                200
LBFL303_GE6.pep    QVQATVVGLL AAVAALLLGV VSREEVDVAK VELLCASSVL TAFLAAFALG
LBFL303_MB5.pep    QVQATVVGLL AAVAALLLGV VSREEVDVAK VELLCASSVL TAFLAAFALG
LBFL303_IE4.pep    QVQATVVGLL AAVAALLLGV VSREEVDVAK VELLCASSVL TAFLAAFALG
Consensus          QVQATVVGLL AAVAALLLGV VSREEVDVAK VELLCASSVL TAFLAAFALG

                   201                                                250
LBFL303_GE6.pep    VLMVCIVIGA RKLGVNPDNI ATPIAASLGD LITLSILALV SSFFYRHKDS
LBFL303_MB5.pep    VLMVCIVIGA RKLGVNPDNI ATPIAASLGD LITLSILALV SSFFYRHKDS
LBFL303_IE4.pep    VLMVCIVIGA RKLGVNPDNI ATPIAASLGD LITLSILALV SSFFYRHKDS
Consensus          VLMVCIVIGA RKLGVNPDNI ATPIAASLGD LITLSILALV SSFFYRHKDS

                   251                                                300
LBFL303_GE6.pep    RYLTPLVCLS FAALTPVWVL IAKQSPPIVK ILKFGWFPII LAMVISSFGG
LBFL303_MB5.pep    RYLTPLVCLS FAALTPVWVL IAKQSPPIVK ILKFGWFPII LAMVISSFGG
LBFL303_IE4.pep    RYLTPLVCLS FAALTPVWVL IAKQSPPIVK ILKFGWFPII LAMVISSFGG
Consensus          RYLTPLVCLS FAALTPVWVL IAKQSPPIVK ILKFGWFPII LAMVISSFGG

                   301                                                350
LBFL303_GE6.pep    LILSKTVSKQ QYKGMAIFTP VICGVGGNLV AIQTSRISTY LHMWSAPGVL
LBFL303_MB5.pep    LILSKTVSKQ QYKGMAIFTP VICGVGGNLV AIQTSRISTY LHMWSAPGVL
LBFL303_IE4.pep    LILSKTVSKQ QYKGMAIFTP VICGVGGNLV AIQTSRISTY LHMWSAPGVL
Consensus          LILSKTVSKQ QYKGMAIFTP VICGVGGNLV AIQTSRISTY LHMWSAPGVL

                   351                                                400
LBFL303_GE6.pep    PLQMKKFWPN PCSTFCTSEI NSMSARVLLL LVVPGHLIFF YIIYLVEGQS
LBFL303_MB5.pep    PLQMKKFWPN PCSTFCTSEI NSMSARVLLL LVVPGHLIFF YIIYLVEGQS
LBFL303_IE4.pep    PLQMKKFWPN PCSTFCTSEI NSMSARVLLL LVVPGHLIFF YIIYLVEGQS
Consensus          PLQMKKFWPN PCSTFCTSEI NSMSARVLLL LVVPGHLIFF YIIYLVEGQS

                   401                                                450
LBFL303_GE6.pep    VINSQTFVVL YLLAGLIQVT ILLYLAEVMV RLTWHQALDP DNHCIPYLTG
LBFL303_MB5.pep    VINSQTFVVL YLLAGLIQVT ILLYLAEVMV RLTWHQALDP DNHCIPYLTG
LBFL303_IE4.pep    VINSQTFVVL YLLAGLIQVT ILLYLAEVMV RLTWHQALDP DNHCIPYLTG
Consensus          VINSQTFVVL YLLAGLIQVT ILLYLAEVMV RLTWHQALDP DNHCIPYLTG

                   451                                                500
LBFL303_GE6.pep    LGDLLGTGLL ALCFFTDWLL KSKAELGGIS ELASGPP*-- ----------
LBFL303_MB5.pep    LGDLLGTGLL ALCFFTDWLL KSKAELGGIS ELASGPP*-- ----------
LBFL303_IE4.pep    LGDLLGSSSV GHTAAVPRRC TASPGWGLIQ PFICTQHLIV SLLSFYFPFC
Consensus          LGDLLGTGLL ALCFFTDWLL KSKAELGGIS ELASGPP*-- ----------

                   501
LBFL303_GE6.pep    --------
LBFL303_MB5.pep    --------
LBFL303_IE4.pep    LLAKTSI*
Consensus          --------
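The statement that SEQ ID NO: 10 matches SEQ ID NO: 8 over the first 456 amino acids can be checked against the alignment. The following is a minimal Python sketch, assuming the first ten residues of the 451-500 block as read from the MB5 and IE4 rows; the helper name `common_prefix_length` is illustrative and does not come from the source.

```python
def common_prefix_length(a: str, b: str) -> int:
    """Count leading positions at which two aligned sequences agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Alignment columns 451-460, transcribed from the MB5 and IE4 rows (assumed reading).
MB5_451_460 = "LGDLLGTGLL"
IE4_451_460 = "LGDLLGSSSV"

# Columns 1-450 are taken to be identical in both rows, so the shared length
# is 450 plus the common prefix of this block.
shared = 450 + common_prefix_length(MB5_451_460, IE4_451_460)
print(shared)
```

Under these assumptions, the two rows first differ at column 457, which is consistent with the 456-residue figure given in the text.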



The LBFL303 clones also exhibit partial homology to a mouse homologue
(GenBank Accession No. XM_132686). SEQ ID NO: 5 shows 44% identity over the
entire contiguous sequence and 68% identity within the open reading frame, while SEQ
ID NO: 7 shows 46% identity over the entire contiguous sequence and 69% identity
within the open reading frame. SEQ ID NO: 9 shows 61% identity over the entire
contiguous sequence and 64% identity within the open reading frame to the mouse
nucleic acid sequence.

SEQ ID NOS: 6, 8 and 10 contain a divalent cation transporter signature that has
two domains. SEQ ID NOS: 6 and 8 contain both domains (at amino acid positions
85-205 and 282-428 in SEQ ID NO: 6, and at amino acid positions 105-241 and 318-464 in
SEQ ID NO: 8), whereas SEQ ID NO: 10 contains only the N-terminal domain (amino
acid residues 105-241). In addition, SEQ ID NOS: 6, 8 and 10 contain a
peroxidase-active site at amino acid positions 390-401 in SEQ ID NO: 6 and at amino
acid positions 426-437 in SEQ ID NOS: 8 and 10.
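The feature coordinates quoted for SEQ ID NO: 6 and SEQ ID NO: 8 differ by a constant offset downstream of the segment that is present only in the longer proteins. The sketch below is a hedged Python illustration; the offset of 36 residues is an assumption read off the alignment (GE6 is gapped at alignment columns 92-127), and the function name `seq6_to_seq8` is hypothetical.

```python
# Assumption from the alignment: SEQ ID NO: 8 carries 36 extra residues
# (alignment columns 92-127) immediately after residue 91 of SEQ ID NO: 6.
INSERT_AFTER = 91
INSERT_LEN = 127 - 92 + 1  # 36


def seq6_to_seq8(pos: int) -> int:
    """Map a residue number in SEQ ID NO: 6 to the numbering of SEQ ID NO: 8."""
    return pos if pos <= INSERT_AFTER else pos + INSERT_LEN


# Coordinates in SEQ ID NO: 6 as given in the text.
features_seq6 = {
    "first domain end": 205,
    "second domain start": 282,
    "second domain end": 428,
    "peroxidase site start": 390,
    "peroxidase site end": 401,
}

features_seq8 = {name: seq6_to_seq8(pos) for name, pos in features_seq6.items()}
for name, pos in features_seq8.items():
    print(f"{name}: {pos}")
```

The mapped values agree with the SEQ ID NO: 8 coordinates stated above (241, 318-464 and 426-437). The start of the first domain is not covered by this mapping, since the text gives it as 85 in SEQ ID NO: 6 and 105 in SEQ ID NO: 8.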

Figures 1, 2, 3, 4 and 5 show the results of a hydrophobicity analysis of the 
amino acid sequence of SEQ ID NOS: 2, 4, 6, 8 and 10. Hydrophilic regions may be 
used to produce antigenic peptides, as described above. 

Analysis by Northern blot was performed to determine the size of the mRNA
transcripts that correspond to SEQ ID NOS: 1, 3, 5, 7 and 9. Northern blots containing
total RNAs from various human tissues were used (ClonTech), and each of clone BC7
(SEQ ID NO: 3), clone GE6 (SEQ ID NO: 5), clone MB5 (SEQ ID NO: 7) and clone IE4
(SEQ ID NO: 9) was radioactively labeled by the random primer method and used to
probe the blots. The blots were hybridized in Church and Gilbert buffer at 65 °C and
washed with 0.1X SSC containing 0.1% SDS at room temperature. The Northern blots
show a single transcript for each of the LBFL302 and LBFL303 genes, approximately
0.65 kb and 2.45 kb in size, respectively. These sizes correspond to the sizes of the
inserts in clones BC7 and BC4 (SEQ ID NOS: 3 and 1), 0.531 kb and 0.578 kb,
respectively, and to 2.2 kb for GE6, 2.3 kb for MB5, and 1.8 kb for IE4 (SEQ ID NOS:
5, 7 and 9, respectively).

To examine the expression of SEQ ID NO: 1, 3, 5, 7 or 9 in various normal tissues, an
electronic Northern blot (e-Northern) was prepared as follows. Using the chips and the
procedures in Example 1, mRNA from a panel of 46 normal tissues, as listed in Table 3,
was hybridized to Affymetrix U95 human GeneChips. The results of these experiments
are shown in Table 3. For each tissue type, the numbers of samples that are called
present or absent are indicated, together with the total number of samples in that sample
set. In addition, the median value and the 25th and 75th percentiles in each tissue type
are listed. Interestingly, although these genes are up-regulated in liver cancer, expression of
LBFL302 and 303 could not be detected in most normal liver samples. This
observation indicates that LBFL302 and 303 may be used as diagnostic agents or markers
to detect liver cancer or to differentiate hepatocellular carcinoma from cirrhotic liver
tissue, as discussed below. Expression levels of LBFL302 appeared to be highest in the
thymus, followed by organs of the reproductive system (testis, endometrium,
myometrium, uterus, cervix and breast) and of the digestive system (esophagus, rectum,
colon and appendix). Expression levels of LBFL303 appeared to be highest in the brain
(cerebellum, frontal cortex, temporal cortex and hippocampus) and in organs of the
reproductive system (testis, endometrium, myometrium, uterus, cervix and breast), but
expression could be detected at lower levels in most other tissues. Expression in the
liver was the lowest.

Table 3a - e-Northern Table for 51263_at: LBFL302 Gene Expression in Normal Tissues

Global Present Freq.: 0.7176

Tissue                Present     Absent      Lower 25%  Median   Upper 75%
Adipose               26 of 32    6 of 32     32.78      40.25    56.11
Adrenal Gland         5 of 12     7 of 12     10.18      28.24    42.42
Appendix              3 of 3      0 of 3      65.?6      75.82    78.15
Artery                3 of 3      0 of 3      39.18      43.53    48.12
Bladder               5 of 5      0 of 5      49.95      81.92    89.71
Bone                  3 of 3      0 of 3      73.?7      112.71   113.59
Breast                76 of 80    4 of 80     56.53      79.61    107.84
Cerebellum            1 of 5      4 of 5      10.58      16.98    17.79
Cervix                96 of 101   5 of 101    75.45      102.71   139.70
Colon                 147 of 151  4 of 151    72.44      107.19   149.44
Cortex Frontal Lobe   2 of 7      5 of 7      11.01      17.48    21.95
Cortex Temporal Lobe  0 of 3      3 of 3      *1.09      10.90    11.83
Duodenum              58 of 61    3 of 61     57.?9      72.35    94.63
Endometrium           21 of 21    0 of 21     123.36     151.44   185.97
Esophagus             26 of 27    1 of 27     98.53      127.38   167.07
Fallopian Tube        46 of 51    5 of 51     37.81      52.34    94.69
Gallbladder           2 of 8      6 of 8      13.23      32.72    39.09
Heart                 1 of 3      2 of 3      8.24       13.13    15.87
Hippocampus           2 of 5      3 of 5      17.??      19.47    31.10
Kidney                24 of 86    62 of 86    8.63       14.36    20.22
Larynx                4 of 4      0 of 4      40.?7      97.82    163.75
Left Atrium           30 of 141   111 of 141  11.25      14.57    19.93
Left Ventricle        0 of 15     15 of 15    4.77       8.67     12.80
Liver                 10 of 34    24 of 34    2.08       10.42    19.94
Lung                  62 of 93    31 of 93    23.04      35.46    55.34
Lymph Node            10 of 11    1 of 11     58.47      85.56    98.77
Muscles               9 of 39     30 of 39    9.03       16.64    27.86
Myometrium            105 of 106  1 of 106    123.43     180.91   216.99
Omentum               14 of 15    1 of 15     46.67      67.85    151.98
Ovary                 49 of 74    25 of 74    24.22      34.?4    55.81
Pancreas              2 of 34     32 of 34    -2.88      6.00     14.34
Placenta              3 of 5      2 of 5      28.65      40.79    53.18
Prostate              30 of 32    2 of 32     44.39      59.04    73.73
Rectum                42 of 43    1 of 43     97.69      133.62   173.02
Right Atrium          34 of 169   135 of 169  9.78       14.77    19.39
Right Ventricle       21 of 160   139 of 160  4.93       10.56    18.22
Skin                  50 of 59    9 of 59     44.?1      64.27    93.29
Small Intestine       60 of 68    8 of 68     41.12      70.61    103.?4
Soft Tissues          4 of 6      2 of 6      26.64      61.11    115.92
Spleen                22 of 31    9 of 31     26.94      39.23    52.19
Stomach               34 of 47    13 of 47    22.49      41.88    80.51
Testis                5 of 5      0 of 5      137.44     236.97   359.92
Thymus                71 of 71    0 of 71     261.31     322.22   358.16
Thyroid Gland         6 of 18     12 of 18    10.27      15.12    31.99
Uterus                58 of 58    0 of 58     88.62      140.06   190.83
WBC                   12 of 40    28 of 40    10.23      15.37    24.59

Table 3b - e-Northern Table for 46690_at: LBFL303 Gene Expression in Normal Tissues

Tissue                Present     Absent      Lower 25%  Median   Upper 75%
Adipose               21 of 32    11 of 32    176.42     227.48   291.00
Adrenal Gland         10 of 12    2 of 12     195.93     286.51   364.93
Appendix              2 of 3      1 of 3      59.10      61.20    71.28
Artery                3 of 3      0 of 3      257.57     268.59   334.19
Bladder               4 of 5      1 of 5      222.51     236.51   310.37
Bone                  2 of 3      1 of 3      241.64     263.23   269.74
Breast                74 of 80    6 of 80     208.22     279.85   332.28
Cerebellum            5 of 5      0 of 5      333.02     348.16   421.84
Cervix                88 of 101   13 of 101   212.62     292.96   344.12
Colon                 70 of 151   81 of 151   63.11      98.21    142.28
Cortex Frontal Lobe   7 of 7      0 of 7      293.14     313.78   323.67
Cortex Temporal Lobe  3 of 3      0 of 3      255.54     280.80   339.41
Duodenum              31 of 61    30 of 61    86.34      113.58   143.09
Endometrium           18 of 21    3 of 21     243.49     259.81   299.21
Esophagus             17 of 27    10 of 27    120.68     166.05   207.75
Fallopian Tube        45 of 51    6 of 51     236.80     296.76   358.36
Gallbladder           8 of 8      0 of 8      250.23     277.49   328.73
Heart                 2 of 3      1 of 3      128.08     129.40   139.86
Hippocampus           5 of 5      0 of 5      216.82     302.36   337.49
Kidney                53 of 86    33 of 86    155.75     202.37   235.42
Larynx                3 of 4      1 of 4      189.07     222.38   268.94
Left Atrium           75 of 141   66 of 141   130.83     173.47   220.14
Left Ventricle        8 of 15     7 of 15     117.45     175.84   208.91
Liver                 5 of 34     29 of 34    8.65       33.49    55.39
Lung                  52 of 93    41 of 93    115.09     167.11   232.69
Lymph Node            8 of 11     3 of 11     119.03     167.95   223.28
Muscles               32 of 39    7 of 39     147.91     200.14   2??.77
Myometrium            98 of 106   8 of 106    248.04     328.05   407.02
Omentum               8 of 15     7 of 15     88.63      179.55   196.14
Ovary                 71 of 74    3 of 74     306.34     384.95   435.31
Pancreas              10 of 34    24 of 34    78.67      128.72   162.20
Placenta              1 of 5      4 of 5      79.06      108.81   11?.?2
Prostate              23 of 32    9 of 32     181.76     193.51   230.41
Rectum                28 of 43    15 of 43    113.19     126.40   167.84
Right Atrium          82 of 169   87 of 169   134.58     166.54   208.73
Right Ventricle       100 of 160  60 of 160   122.83     165.44   216.16
Skin                  49 of 59    10 of 59    185.75     236.37   297.31
Small Intestine       33 of 68    35 of 68    69.62      109.95   142.30
Soft Tissues          5 of 6      1 of 6      186.44     214.50   260.89
Spleen                22 of 31    9 of 31     149.56     184.29   215.47
Stomach               20 of 47    27 of 47    77.44      114.84   158.63
Testis                5 of 5      0 of 5      396.28     429.08   448.94
Thymus                70 of 71    1 of 71     296.44     348.71   400.48
Thyroid Gland         13 of 18    5 of 18     228.07     310.09   350.55
Uterus                53 of 58    5 of 58     249.90     318.35   361.58
WBC                   15 of 40    25 of 40    68.93      94.14    122.44
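The per-tissue entries of Tables 3a and 3b (present and absent counts, median, and 25th and 75th percentiles) can be derived from per-sample signal values and detection calls. The following Python sketch shows one way to do so; the input values are hypothetical, the "P"/"A" call labels are an assumption, and the interpolation method used for the percentiles is also an assumption, since the source does not state how they were computed.

```python
from statistics import median, quantiles


def summarize_tissue(signals, calls):
    """Summarize one tissue: present/absent counts plus quartiles of the signal."""
    if len(signals) != len(calls):
        raise ValueError("signals and calls must have the same length")
    present = sum(1 for c in calls if c == "P")
    total = len(calls)
    # Assumed percentile convention: linear interpolation including both endpoints.
    q1, _, q3 = quantiles(signals, n=4, method="inclusive")
    return {
        "present": f"{present} of {total}",
        "absent": f"{total - present} of {total}",
        "lower_25": q1,
        "median": median(signals),
        "upper_75": q3,
    }


# Hypothetical signal intensities and detection calls for five samples of one tissue.
example = summarize_tissue([10.0, 20.0, 30.0, 40.0, 50.0], ["P", "A", "P", "P", "A"])
print(example)
```

Applying such a function to each tissue's sample set would yield one table row per tissue.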



INDUSTRIAL APPLICABILITY 

Example 3

Detection of LBFL302 and 303 mRNA for Liver Cancer Screening

The expression level of mRNA corresponding to SEQ ID NO: 1, 3, 5, 7 or 9 is
determined in liver tissue biopsy samples, as described in Example 1, i.e., by screening
mRNA samples on a GeneChip, or as described in Example 2, i.e., by screening mRNA
samples on a Northern blot. Alternatively, samples from non-liver hyperplastic tissues
in malignant or non-malignant states may also be analyzed. Liver tissue samples from
patients with liver cancer and from normal subjects may be used as positive and negative
controls. Using any means of assaying gene expression, a level of expression higher
than that of the normal control is indicative of liver cancer or a likelihood of developing
liver cancer.
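The comparison against the normal control in Example 3 can be expressed as a simple decision rule. The Python sketch below is only illustrative: the source does not specify how "higher than that of the normal control" is quantified, so comparing against the highest normal-control value is an assumption, and all expression values shown are hypothetical.

```python
def exceeds_normal_control(sample_level: float, normal_levels: list[float]) -> bool:
    """Return True when the sample level is above every normal-control level."""
    if not normal_levels:
        raise ValueError("at least one normal control is required")
    # Assumed rule: the sample must exceed the highest normal-control value.
    return sample_level > max(normal_levels)


# Hypothetical expression values (arbitrary units), not measurements from the source.
normal_liver = [2.1, 10.4, 19.9]
print(exceeds_normal_control(85.0, normal_liver))
print(exceeds_normal_control(8.0, normal_liver))
```

In practice a fold-change threshold or a statistical test against the control distribution might be preferred; the rule above is the simplest reading of the text.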

Although the present invention has been described in detail with reference to
examples above, it is understood that various modifications can be made without
departing from the spirit of the invention. Accordingly, the invention is limited only by
the following claims. All cited patents, patent applications and publications referred to
in this application are herein incorporated by reference in their entirety.

What Is Claimed:

1. An isolated nucleic acid molecule selected from the group consisting of: (a) an
isolated nucleic acid molecule comprising SEQ ID NO: 1, 3, 5 or 7, (b) an isolated
nucleic acid molecule encoding SEQ ID NO: 6 or 8, (c) an isolated nucleic acid molecule
that encodes a protein that is expressed in liver cancer and that exhibits at least about
95% nucleotide sequence identity over the entire contiguous sequence of SEQ ID NO: 5,
(d) an isolated nucleic acid molecule that encodes a protein that is expressed in liver
cancer and that exhibits at least about 75% nucleotide sequence identity over the entire
contiguous sequence of SEQ ID NO: 7, and (e) an isolated nucleic acid molecule
comprising the complement of a nucleic acid molecule of (a), (b), (c) or (d).

2. The isolated nucleic acid molecule of claim 1, wherein the nucleic acid molecule
consists of nucleotides 155-418 of SEQ ID NO: 1.

3. The isolated nucleic acid molecule of claim 1, wherein the nucleic acid molecule
consists of nucleotides 139-402 of SEQ ID NO: 3.

4. The isolated nucleic molecule of claim 1, wherein the nucleic acid molecule
comprises nucleotides 32-1384 of SEQ ID NO: 5.

5. The isolated nucleic molecule of claim 1, wherein the nucleic acid molecule
comprises nucleotides 32-1387 of SEQ ID NO: 5.

6. The isolated nucleic acid molecule of claim 1, wherein the nucleic acid molecule
consists of nucleotides 32-1384 of SEQ ID NO: 5.

7. The isolated nucleic molecule of claim 1, wherein the nucleic acid molecule
comprises nucleotides 41-1501 of SEQ ID NO: 7.

8. The isolated nucleic molecule of claim 1, wherein the nucleic acid molecule
comprises nucleotides 41-1504 of SEQ ID NO: 7.

9. The isolated nucleic acid molecule of claim 1, wherein the nucleic acid molecule
consists of nucleotides 41-1501 of SEQ ID NO: 7.

10. The isolated nucleic acid molecule of any one of claims 1-9, wherein said nucleic
acid molecule is operably linked to one or more expression control elements.

11. A vector comprising an isolated nucleic acid molecule of any one of claims 1-9.

12. A host cell transformed to contain the nucleic acid molecule of any one of claims 1-9.

13. A host cell comprising a vector of claim 11.

14. A host cell of claim 13, wherein said host cell is selected from the group consisting
of prokaryotic host cells and eukaryotic host cells.

15. A method for producing a polypeptide comprising culturing a host cell transformed
with the nucleic acid molecule of any one of claims 1-9 under conditions in which the
protein encoded by said nucleic acid molecule is expressed.

16. The method of claim 15, wherein said host cell is selected from the group
consisting of prokaryotic host cells and eukaryotic host cells.

17. An isolated polypeptide produced by the method of claim 15.

18. An isolated polypeptide or protein selected from the group consisting of an isolated
polypeptide comprising the amino acid sequence of SEQ ID NO: 2, 4, 6 or 8, an isolated
polypeptide comprising a fragment of at least 10 amino acids of SEQ ID NO: 2 or 4, an
isolated polypeptide comprising conservative amino acid substitutions of SEQ ID NO: 2
or 4, an isolated polypeptide comprising naturally occurring amino acid sequence
variants of SEQ ID NO: 2 or 4, an isolated polypeptide exhibiting at least about 75%
amino acid sequence identity with SEQ ID NO: 2 or 4, and a protein having at least
about 95% amino acid sequence identity with SEQ ID NO: 6 or 8.

19. An isolated antibody or antigen-binding antibody fragment that binds to a
polypeptide of claim 18.

20. An antibody of claim 19, wherein said antibody is a monoclonal or a polyclonal
antibody.

21. A method of identifying an agent which modulates the expression of a nucleic acid
encoding a protein of claim 18, comprising:

exposing cells which express the nucleic acid to the agent; and

determining whether the agent modulates expression of said nucleic acid, thereby
identifying an agent which modulates the expression of a nucleic acid encoding the
protein.

22. A method of identifying an agent which modulates the level of or at least one
activity of a protein of claim 18 or a protein comprising SEQ ID NO: 10, comprising:

exposing cells which express the protein to the agent;

determining whether the agent modulates the level of or at least one activity of said
protein, thereby identifying an agent which modulates the level of or at least one activity
of the protein.

23. The method of claim 22, wherein the agent modulates one activity of the protein.

24. A method of identifying binding partners for a protein of claim 18 or a protein
comprising SEQ ID NO: 10, comprising:

exposing said protein to a potential binding partner; and

determining if the potential binding partner binds to said protein, thereby
identifying binding partners for the protein.

25. A method of modulating the expression of a nucleic acid encoding a protein of
claim 18 or a protein comprising SEQ ID NO: 10, comprising:

administering an effective amount of an agent which modulates the expression of a
nucleic acid encoding the protein.

26. A method of modulating at least one activity of a protein of claim 18 or a protein
comprising SEQ ID NO: 10, comprising:

administering an effective amount of an agent which modulates at least one activity
of the protein.

27. A non-human transgenic animal modified to contain a nucleic acid molecule of any
of claims 1-9 or SEQ ID NO: 10.

28. The transgenic animal of claim 27, wherein the nucleic acid molecule contains a
mutation that prevents expression of the encoded protein.

29. A method of diagnosing a disease state in a subject, comprising:

determining the level of expression of a nucleic acid molecule or protein of any one
of claims 1-9 or 18, a nucleic acid comprising SEQ ID NO: 9 or a protein molecule
comprising SEQ ID NO: 10.

30. The method of claim 29, wherein the disease state is liver cancer.

31. The method of claim 30, wherein the disease state is hepatocellular carcinoma.

32. The method of claim 29, wherein the disease state is a malignant neoplasm.

33. The method of claim 32, wherein the malignant neoplasm occurs in the bladder,
breast, cervix, colon, kidney, lung, myometrium, ovary, pancreas, prostate, rectum, skin,
small intestine, soft tissue, spleen, stomach, testis or thyroid gland.

34. A composition comprising a diluent and a polypeptide or protein selected from the
group consisting of an isolated polypeptide comprising the amino acid sequence of SEQ
ID NO: 2, 4, 6, 8 or 10, an isolated polypeptide comprising a fragment of at least 10
amino acids of SEQ ID NO: 2, 4, 6, 8 or 10, an isolated polypeptide comprising
conservative amino acid substitutions of SEQ ID NO: 2 or 4, an isolated polypeptide
comprising naturally occurring amino acid sequence variants of SEQ ID NO: 2 or 4, an
isolated polypeptide exhibiting at least about 75% amino acid sequence identity with
SEQ ID NO: 2 or 4, and a polypeptide exhibiting at least about 95% amino acid
sequence identity with SEQ ID NO: 6, 8 or 10.
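The nucleotide ranges recited in claims 2-9 can be cross-checked against the protein lengths given in the sequence listing by counting codons. The sketch below is a small Python illustration; the helper name `codon_count` is hypothetical, and the observation that the longer ranges (32-1387 and 41-1504) add one codon, presumably the stop codon, is an interpretation rather than a statement from the source.

```python
def codon_count(start: int, end: int) -> int:
    """Number of complete codons in the inclusive, 1-based range start..end."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}-{end} is not a whole number of codons")
    return length // 3


# Ranges as recited in claims 2-9.
claimed_ranges = {
    "SEQ ID NO: 1, nt 155-418": (155, 418),
    "SEQ ID NO: 3, nt 139-402": (139, 402),
    "SEQ ID NO: 5, nt 32-1384": (32, 1384),
    "SEQ ID NO: 5, nt 32-1387": (32, 1387),
    "SEQ ID NO: 7, nt 41-1501": (41, 1501),
    "SEQ ID NO: 7, nt 41-1504": (41, 1504),
}

for label, (start, end) in claimed_ranges.items():
    print(label, codon_count(start, end))
```

The counts of 88 codons for SEQ ID NOS: 1 and 3 and 451 codons for nucleotides 32-1384 of SEQ ID NO: 5 agree with the lengths listed for SEQ ID NOS: 2, 4 and 6.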
<110> LG Life Sciences, Ltd.

<120> Gene families associated with liver cancer

<130> PC03016-LG

<150> US 60/402,905
<151> 2002-08-14

<150> US 60/403,651
<151> 2002-08-16

<160> 10

<170> KopatentIn 1.71

<210> 1
<211> 578
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> (155)..(418)
<223> Gene LBFL302, Clone BC4

<400> 1
cggacgcgtg ggttcgaacg ttcggactga ggtttttctg cctgaagaag cgtcatacgg  60

accggattgt tttcgctggc ccagtgtccc cggagcttgt gtgcgataca gagagcacct  120

cggaagctga ggcagctggt acttgacaga gagg atg gcg ctg tcg acc  169
                                      Met Ala Leu Ser Thr
                                        1               5

ata gtc tcc cag agg aag cag ata aag cgg aag gct ccc cgt ggc ttt  217
Ile Val Ser Gln Arg Lys Gln Ile Lys Arg Lys Ala Pro Arg Gly Phe
                 10                  15                  20

cta aag cga gtc ttc aag cga aag aag cct caa ctt cgt ctg gag aaa  265
Leu Lys Arg Val Phe Lys Arg Lys Lys Pro Gln Leu Arg Leu Glu Lys
             25                  30                  35

agt ggt gac tta ttg gtc cat ctg aac tgt tta ctg ttt gtt cat cga  313
Ser Gly Asp Leu Leu Val His Leu Asn Cys Leu Leu Phe Val His Arg
         40                  45                  50

tta gca gaa gag tcc agg aca aac gct tgt gcg agt aaa tgt aga gtc  361
Leu Ala Glu Glu Ser Arg Thr Asn Ala Cys Ala Ser Lys Cys Arg Val
     55                  60                  65

att aac aag gag cat gta ctg gcc gca gca aag gta att cta aag aag  409
Ile Asn Lys Glu His Val Leu Ala Ala Ala Lys Val Ile Leu Lys Lys
 70                  75                  80                  85

agc aga ggt ta gaagtcaaag aacatattct tgaaagttat gatgcattct  460
Ser Arg Gly

tttgggtggt aacagatcat aaagacattt tttacacatc agttaatatg ggattattaa  520

atattggcta taaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaa  578



<210> 2
<211> 88
<212> PRT
<213> Homo sapiens

<400> 2
Met Ala Leu Ser Thr Ile Val Ser Gln Arg Lys Gln Ile Lys Arg Lys
  1               5                  10                  15

Ala Pro Arg Gly Phe Leu Lys Arg Val Phe Lys Arg Lys Lys Pro Gln
             20                  25                  30

Leu Arg Leu Glu Lys Ser Gly Asp Leu Leu Val His Leu Asn Cys Leu
         35                  40                  45

Leu Phe Val His Arg Leu Ala Glu Glu Ser Arg Thr Asn Ala Cys Ala
     50                  55                  60

Ser Lys Cys Arg Val Ile Asn Lys Glu His Val Leu Ala Ala Ala Lys
 65                  70                  75                  80

Val Ile Leu Lys Lys Ser Arg Gly
                 85



<210> 3
<211> 531
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> (139)..(402)
<223> Gene LBFL302, Clone BC7

<400> 3
cccacgcgtc cggaggtttt tctgcctgaa gaagcgtcat acggaccgga ttgttttcgc  60

tggcccagtg tccccggagc ttgtgtgcga tacagagagc acctcggaag ctgaggcagc  120

tggtacttga cagagagg atg gcg ctg tcg acc ata gtc tcc cag agg aag  171
                    Met Ala Leu Ser Thr Ile Val Ser Gln Arg Lys
                      1               5                  10

cag ata aag cgg aag gct ccc cgt ggc ttt cta aag cga gtc ttc aag  219
Gln Ile Lys Arg Lys Ala Pro Arg Gly Phe Leu Lys Arg Val Phe Lys
             15                  20                  25

cta aag aag cct caa ctt cgt ctg gag aaa agt ggt gac tta ttg gtc  267
Leu Lys Lys Pro Gln Leu Arg Leu Glu Lys Ser Gly Asp Leu Leu Val
         30                  35                  40

cat ctg aac tgt tta ctg ttt gtt cat cga tta gca gaa gag tcc agg  315
His Leu Asn Cys Leu Leu Phe Val His Arg Leu Ala Glu Glu Ser Arg
     45                  50                  55

aca aac gct tgt gcg agt aaa tgt aga gtc att aac aag gag cat gta  363
Thr Asn Ala Cys Ala Ser Lys Cys Arg Val Ile Asn Lys Glu His Val
 60                  65                  70                  75

ctg gcc gca gca aag gta att cta aag aag agc aga ggt tagaagtc  410
Leu Ala Ala Ala Lys Val Ile Leu Lys Lys Ser Arg Gly
                 80                  85

aaagaacata ttcttgaaag ttatgatgca ttcttttggg tggtaacaga tcataaagac  470

attttttaca catcagttaa tatgggatta ttaaatattg gatataaaaa aaaaaaaaaa  530

a  531



<210> 4
<211> 88
<212> PRT
<213> Homo sapiens

<400> 4
Met Ala Leu Ser Thr Ile Val Ser Gln Arg Lys Gln Ile Lys Arg Lys
  1               5                  10                  15

Ala Pro Arg Gly Phe Leu Lys Arg Val Phe Lys Leu Lys Lys Pro Gln
             20                  25                  30

Leu Arg Leu Glu Lys Ser Gly Asp Leu Leu Val His Leu Asn Cys Leu
         35                  40                  45

Leu Phe Val His Arg Leu Ala Glu Glu Ser Arg Thr Asn Ala Cys Ala
     50                  55                  60

Ser Lys Cys Arg Val Ile Asn Lys Glu His Val Leu Ala Ala Ala Lys
 65                  70                  75                  80

Val Ile Leu Lys Lys Ser Arg Gly
                 85

<210> 5
<211> 2067
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> (32)..(1384)
<223> Clone GE6

<400> 5
cccgggctgc caggcgccca gctgtgccca g atg gat ggg aca gag  46
                                   Met Asp Gly Thr Glu
                                     1               5

acc cgg cag cgg agg ctg gac agc tgt ggc aag cca ggg gag ctg ggg  94
Thr Arg Gln Arg Arg Leu Asp Ser Cys Gly Lys Pro Gly Glu Leu Gly
                 10                  15                  20

ctt cct cac ccc ctc agc aca gga gga ctc cct gta gcc tca gaa gat  142
Leu Pro His Pro Leu Ser Thr Gly Gly Leu Pro Val Ala Ser Glu Asp
             25                  30                  35

gga gct ctc agg gcc cct gag agc caa agc gtg acc ccc aag cca ctg  190
Gly Ala Leu Arg Ala Pro Glu Ser Gln Ser Val Thr Pro Lys Pro Leu
         40                  45                  50

gag act gag cct agc agg gag acc gcc tgg tcc ata ggc ctt cag gtg  238
Glu Thr Glu Pro Ser Arg Glu Thr Ala Trp Ser Ile Gly Leu Gln Val
     55                  60                  65

acc gtg ccc ttc atg ttt gca ggc ctg gga ctg tcc tgg gcc ggc atg  286
Thr Val Pro Phe Met Phe Ala Gly Leu Gly Leu Ser Trp Ala Gly Met
 70                  75                  80                  85

ctt ctg gac tat ttc cag gcc aac act gga caa att gat gac ccc cag  334
Leu Leu Asp Tyr Phe Gln Ala Asn Thr Gly Gln Ile Asp Asp Pro Gln
                 90                  95                 100

gag cag cac aga gtc atc agc agc aac ctg gcc ctc atc cag gtg cag  382
Glu Gln His Arg Val Ile Ser Ser Asn Leu Ala Leu Ile Gln Val Gln
            105                 110                 115

gcc act gtc gtg ggg ctc ttg gct gct gtg gct gcg ctg ctg ttg ggc  430
Ala Thr Val Val Gly Leu Leu Ala Ala Val Ala Ala Leu Leu Leu Gly
        120                 125                 130

gtg gtg tct cga gag gaa gtg gat gtc gcc aag gtg gag ttg ctg tgt  478
Val Val Ser Arg Glu Glu Val Asp Val Ala Lys Val Glu Leu Leu Cys
    135                 140                 145

gcc agc agt gtc ctc act gcc ttc ctt gca gcc ttt gcc ctg ggg gtg  526
Ala Ser Ser Val Leu Thr Ala Phe Leu Ala Ala Phe Ala Leu Gly Val
150                 155                 160                 165

ctg atg gtc tgt ata gtg att ggt gct cga aag ctc ggg gtc aac cca  574
Leu Met Val Cys Ile Val Ile Gly Ala Arg Lys Leu Gly Val Asn Pro
                170                 175                 180

gac aac att gcc acg ccc att gca gcc agc ctg gga gac ctc atc aca  622
Asp Asn Ile Ala Thr Pro Ile Ala Ala Ser Leu Gly Asp Leu Ile Thr
            185                 190                 195

ctg tcc att ctg gct ttg gtt agc agc ttc ttc tac aga cac aaa gat  670
Leu Ser Ile Leu Ala Leu Val Ser Ser Phe Phe Tyr Arg His Lys Asp
        200                 205                 210

agt cgg tat ctg acg ccg ctg gtc tgc ctc agc ttt gcg gct ctg acc  718
Ser Arg Tyr Leu Thr Pro Leu Val Cys Leu Ser Phe Ala Ala Leu Thr
    215                 220                 225

cca gtg tgg gtc ctc att gcc aag cag agc cca ccc atc gtg aag atc  766
Pro Val Trp Val Leu Ile Ala Lys Gln Ser Pro Pro Ile Val Lys Ile
230                 235                 240                 245

ctg aag ttt ggc tgg ttc cca atc atc ctg gcc atg gtc atc agc agt  814
Leu Lys Phe Gly Trp Phe Pro Ile Ile Leu Ala Met Val Ile Ser Ser
                250                 255                 260

ttc gga gga ctc atc ttg agc aaa acc gtt tct aaa cag cag tac aaa  862
Phe Gly Gly Leu Ile Leu Ser Lys Thr Val Ser Lys Gln Gln Tyr Lys
            265                 270                 275

ggc atg gcg ata ttt acc ccc gtc ata tgt ggt gtt ggt ggc aat ctg  910
Gly Met Ala Ile Phe Thr Pro Val Ile Cys Gly Val Gly Gly Asn Leu
        280                 285                 290

gtg gcc att cag acc agc cga atc tca acc tac ctg cac atg tgg agt  958
Val Ala Ile Gln Thr Ser Arg Ile Ser Thr Tyr Leu His Met Trp Ser
    295                 300                 305

gca cct ggc gtc ctg ccc ctc cag atg aag aaa ttc tgg ccc aac ccg  1006
Ala Pro Gly Val Leu Pro Leu Gln Met Lys Lys Phe Trp Pro Asn Pro
310                 315                 320                 325

tgt tct act ttc tgc acg tca gaa atc aat tcc atg tca gct cga gtc  1054
Cys Ser Thr Phe Cys Thr Ser Glu Ile Asn Ser Met Ser Ala Arg Val
                330                 335                 340

ctg ctc ttg ctg gtg gtc cca ggc cat ctg att ttc ttc tac atc atc  1102
Leu Leu Leu Leu Val Val Pro Gly His Leu Ile Phe Phe Tyr Ile Ile
            345                 350                 355

tac ctg gtg gag ggt cag tca gtc ata aac agc cag acc ttt gtg gtg  1150
Tyr Leu Val Glu Gly Gln Ser Val Ile Asn Ser Gln Thr Phe Val Val
        360                 365                 370

ctc tac ctg ctg gca ggc ctg atc cag gtg aca atc ctg ctg tac ctg  1198
Leu Tyr Leu Leu Ala Gly Leu Ile Gln Val Thr Ile Leu Leu Tyr Leu
    375                 380                 385

gca gaa gtg atg gtt cgg ctg act tgg cac cag gcc ctg gat cct gac  1246
Ala Glu Val Met Val Arg Leu Thr Trp His Gln Ala Leu Asp Pro Asp
390                 395                 400                 405

aac cac tgc atc ccc tac ctt aca ggg ctg ggg gac ctg ctc ggt act  1294
Asn His Cys Ile Pro Tyr Leu Thr Gly Leu Gly Asp Leu Leu Gly Thr
                410                 415                 420

ggc ctc ctg gca ctc tgc ttt ttc act gac tgg cta ctg aag agc aag  1342
Gly Leu Leu Ala Leu Cys Phe Phe Thr Asp Trp Leu Leu Lys Ser Lys
            425                 430                 435

gca gag ctg ggt ggc atc tca gaa ctg gca tct gga cct ccc taactg  1390
Ala Glu Leu Gly Gly Ile Ser Glu Leu Ala Ser Gly Pro Pro
        440                 445                 450

ggccccgctg gtcccatttg ctcattagaa tttcctctca catcagtggg atacagaatt  1450

cagtttctcc cttgccaggt ccttgggatg gttgacccct gcctctgcag tagccttttg  1510

tgagtctgct aaggtagctc tcacacacct cggctctggg gttgatacct gagcctgcaa  1570

tagagccctg aaatcaagag catggcttga gtgtgtgaat atgatgtgtg cacatgctta  1630

atgagcgtgc aagtgtgcac acgtttgtgg agaggagggt gttctggcct gagaaggtaa  1690

agaagaggca tgtccagtat gctttgcagg gtgtgtttgc tcttttccat gcccatgcaa  1750

cccagattgg ggtggagcag gaaggagctc ttttctgttc ccaagcctca gaactcttga  1810

gctgtggctt acttgctgtc ttcaccaggt tcaagctccg tgggccacac tgctgctgtg  1870

ccaagaaggt gtacagcctc cccaggatgg ggcctcatac aacccttcat ctgcactcaa  1930

catttaatcg tgtccttgct gtctttttat tttccttttt gtttgttagc aaaaacctct  1990

atttagattt caataatcag agaagtgtaa aataaaacag attatattgt aaaaaaaaaa  2050

aaaaaaaaaa aaaaaaa  2067



<210> 6
<211> 451
<212> PRT
<213> Homo sapiens

<400> 6
Met Asp Gly Thr Glu Thr Arg Gln Arg Arg Leu Asp Ser Cys Gly Lys
  1               5                  10                  15

Pro Gly Glu Leu Gly Leu Pro His Pro Leu Ser Thr Gly Gly Leu Pro
             20                  25                  30

Val Ala Ser Glu Asp Gly Ala Leu Arg Ala Pro Glu Ser Gln Ser Val
         35                  40                  45

Thr Pro Lys Pro Leu Glu Thr Glu Pro Ser Arg Glu Thr Ala Trp Ser
     50                  55                  60

Ile Gly Leu Gln Val Thr Val Pro Phe Met Phe Ala Gly Leu Gly Leu
 65                  70                  75                  80

Ser Trp Ala Gly Met Leu Leu Asp Tyr Phe Gln Ala Asn Thr Gly Gln
                 85                  90                  95

Ile Asp Asp Pro Gln Glu Gln His Arg Val Ile Ser Ser Asn Leu Ala
            100                 105                 110

Leu Ile Gln Val Gln Ala Thr Val Val Gly Leu Leu Ala Ala Val Ala
        115                 120                 125

Ala Leu Leu Leu Gly Val Val Ser Arg Glu Glu Val Asp Val Ala Lys
    130                 135                 140

Val Glu Leu Leu Cys Ala Ser Ser Val Leu Thr Ala Phe Leu Ala Ala
145                 150                 155                 160

Phe Ala Leu Gly Val Leu Met Val Cys Ile Val Ile Gly Ala Arg Lys
                165                 170                 175

Leu Gly Val Asn Pro Asp Asn Ile Ala Thr Pro Ile Ala Ala Ser Leu
            180                 185                 190

Gly Asp Leu Ile Thr Leu Ser Ile Leu Ala Leu Val Ser Ser Phe Phe
        195                 200                 205

Tyr Arg His Lys Asp Ser Arg Tyr Leu Thr Pro Leu Val Cys Leu Ser
    210                 215                 220

Phe Ala Ala Leu Thr Pro Val Trp Val Leu Ile Ala Lys Gln Ser Pro
225                 230                 235                 240

Pro Ile Val Lys Ile Leu Lys Phe Gly Trp Phe Pro Ile Ile Leu Ala
                245                 250                 255

Met Val Ile Ser Ser Phe Gly Gly Leu Ile Leu Ser Lys Thr Val Ser
            260                 265                 270

Lys Gln Gln Tyr Lys Gly Met Ala Ile Phe Thr Pro Val Ile Cys Gly
        275                 280                 285

Val Gly Gly Asn Leu Val Ala Ile Gln Thr Ser Arg Ile Ser Thr Tyr
    290                 295                 300

Leu His Met Trp Ser Ala Pro Gly Val Leu Pro Leu Gln Met Lys Lys
305                 310                 315                 320

Phe Trp Pro Asn Pro Cys Ser Thr Phe Cys Thr Ser Glu Ile Asn Ser
                325                 330                 335

Met Ser Ala Arg Val Leu Leu Leu Leu Val Val Pro Gly His Leu Ile
            340                 345                 350

Phe Phe Tyr Ile Ile Tyr Leu Val Glu Gly Gln Ser Val Ile Asn Ser
        355                 360                 365

Gln Thr Phe Val Val Leu Tyr Leu Leu Ala Gly Leu Ile Gln Val Thr
    370                 375                 380

Ile Leu Leu Tyr Leu Ala Glu Val Met Val Arg Leu Thr Trp His Gln
385                 390                 395                 400

Ala Leu Asp Pro Asp Asn His Cys Ile Pro Tyr Leu Thr Gly Leu Gly
                405                 410                 415

Asp Leu Leu Gly Thr Gly Leu Leu Ala Leu Cys Phe Phe Thr Asp Trp
            420                 425                 430

Leu Leu Lys Ser Lys Ala Glu Leu Gly Gly Ile Ser Glu Leu Ala Ser
        435                 440                 445

Gly Pro Pro
    450



<210> 7

<211> 2178

<212> DNA

<213> Homo sapiens

<220>

<221> CDS

<222> (41)..(1501)

<223> Clone MBS

<400> 7

gggaggggac ccgggctgcc aggcgcccag ctgtgcccag atg gat ggg aca gag 55

Met Asp Gly Thr Glu
1 5

acc cgg cag cgg agg ctg gac agc tgt ggc aag cca ggg gag ctg ggg 103
Thr Arg Gln Arg Arg Leu Asp Ser Cys Gly Lys Pro Gly Glu Leu Gly
10 15 20

ctt cct cac ccc ctc agc aca gga gga ctc cct gta gcc tca gaa gat 151
Leu Pro His Pro Leu Ser Thr Gly Gly Leu Pro Val Ala Ser Glu Asp
25 30 35

gga gct ctc agg gcc cct gag agc caa agc gtg acc ccc aag cca ctg 199
Gly Ala Leu Arg Ala Pro Glu Ser Gln Ser Val Thr Pro Lys Pro Leu
40 45 50

gag act gag cct agc agg gag acc gcc tgg tcc ata ggc ctt cag gtg 247
Glu Thr Glu Pro Ser Arg Glu Thr Ala Trp Ser Ile Gly Leu Gln Val
55 60 65

acc gtg ccc ttc atg ttt gca ggc ctg gga ctg tcc tgg gcc ggc atg 295
Thr Val Pro Phe Met Phe Ala Gly Leu Gly Leu Ser Trp Ala Gly Met
70 75 80 85

ctt ctg gac tat ttc cag cac tgg cct gtg ttt gtg gag gtg aaa gac 343
Leu Leu Asp Tyr Phe Gln His Trp Pro Val Phe Val Glu Val Lys Asp
90 95 100

ctt ttg aca ttg gtg ccg ccc ctg gtg ggc ctg aag ggg aac ctg gag 391
Leu Leu Thr Leu Val Pro Pro Leu Val Gly Leu Lys Gly Asn Leu Glu
105 110 115

atg aca ctg gca tcc aga ctc tcc aca gct gcc aac act gga caa att 439
Met Thr Leu Ala Ser Arg Leu Ser Thr Ala Ala Asn Thr Gly Gln Ile
120 125 130

gat gac ccc cag gag cag cac aga gtc atc agc agc aac ctg gcc ctc 487
Asp Asp Pro Gln Glu Gln His Arg Val Ile Ser Ser Asn Leu Ala Leu
135 140 145

atc cag gtg cag gcc act gtc gtg ggg ctc ttg gct gct gtg gct gcg 535
Ile Gln Val Gln Ala Thr Val Val Gly Leu Leu Ala Ala Val Ala Ala
150 155 160 165

ctg ctg ttg ggc gtg gtg tct cga gag gaa gtg gat gtc gcc aag gtg 583
Leu Leu Leu Gly Val Val Ser Arg Glu Glu Val Asp Val Ala Lys Val
170 175 180

gag ttg ctg tgt gcc agc agt gtc ctc act gcc ttc ctt gca gcc ttt 631
Glu Leu Leu Cys Ala Ser Ser Val Leu Thr Ala Phe Leu Ala Ala Phe
185 190 195

gcc ctg ggg gtg ctg atg gtc tgt ata gtg att ggt gct cga aag ctc 679
Ala Leu Gly Val Leu Met Val Cys Ile Val Ile Gly Ala Arg Lys Leu
200 205 210

ggg gtc aac cca gac aac att gcc acg ccc att gca gcc agc ctg gga 727
Gly Val Asn Pro Asp Asn Ile Ala Thr Pro Ile Ala Ala Ser Leu Gly
215 220 225

gac ctc atc aca ctg tcc att ctg gct ttg gtt agc agc ttc ttc tac 775
Asp Leu Ile Thr Leu Ser Ile Leu Ala Leu Val Ser Ser Phe Phe Tyr
230 235 240 245

aga cac aaa gat agt cgg tat ctg acg ccg ctg gtc tgc ctc agc ttt 823
Arg His Lys Asp Ser Arg Tyr Leu Thr Pro Leu Val Cys Leu Ser Phe
250 255 260

gcg gct ctg acc cca gtg tgg gtc ctc att gcc aag cag agc cca ccc 871
Ala Ala Leu Thr Pro Val Trp Val Leu Ile Ala Lys Gln Ser Pro Pro
265 270 275

atc gtg aag atc ctg aag ttt ggc tgg ttc cca atc atc ctg gcc atg 919
Ile Val Lys Ile Leu Lys Phe Gly Trp Phe Pro Ile Ile Leu Ala Met
280 285 290

gtc atc agc agt ttc gga gga ctc atc ttg agc aaa acc gtt tct aaa 967
Val Ile Ser Ser Phe Gly Gly Leu Ile Leu Ser Lys Thr Val Ser Lys
295 300 305

cag cag tac aaa ggc atg gcg ata ttt acc ccc gtc ata tgt ggt gtt 1015
Gln Gln Tyr Lys Gly Met Ala Ile Phe Thr Pro Val Ile Cys Gly Val
310 315 320 325

ggt ggc aat ctg gtg gcc att cag acc agc cga atc tca acc tac ctg 1063
Gly Gly Asn Leu Val Ala Ile Gln Thr Ser Arg Ile Ser Thr Tyr Leu
330 335 340

cac atg tgg agt gca cct ggc gtc ctg ccc ctc cag atg aag aaa ttc 1111
His Met Trp Ser Ala Pro Gly Val Leu Pro Leu Gln Met Lys Lys Phe
345 350 355

tgg ccc aac ccg tgt tct act ttc tgc acg tca gaa atc aat tcc atg 1159
Trp Pro Asn Pro Cys Ser Thr Phe Cys Thr Ser Glu Ile Asn Ser Met
360 365 370

tca gct cga gtc ctg ctc ttg ctg gtg gtc cca ggc cat ctg att ttc 1207
Ser Ala Arg Val Leu Leu Leu Leu Val Val Pro Gly His Leu Ile Phe
375 380 385

ttc tac atc atc tac ctg gtg gag ggt cag tca gtc ata aac agc cag 1255
Phe Tyr Ile Ile Tyr Leu Val Glu Gly Gln Ser Val Ile Asn Ser Gln
390 395 400 405

acc ttt gtg gtg ctc tac ctg ctg gca ggc ctg atc cag gtg aca atc 1303
Thr Phe Val Val Leu Tyr Leu Leu Ala Gly Leu Ile Gln Val Thr Ile
410 415 420

ctg ctg tac ctg gca gaa gtg atg gtt cgg ctg act tgg cac cag gcc 1351
Leu Leu Tyr Leu Ala Glu Val Met Val Arg Leu Thr Trp His Gln Ala
425 430 435

ctg gat cct gac aac cac tgc atc ccc tac ctt aca ggg ctg ggg gac 1399
Leu Asp Pro Asp Asn His Cys Ile Pro Tyr Leu Thr Gly Leu Gly Asp
440 445 450

ctg ctc ggt act ggc ctc ctg gca ctc tgc ttt ttc act gac tgg cta 1447
Leu Leu Gly Thr Gly Leu Leu Ala Leu Cys Phe Phe Thr Asp Trp Leu
455 460 465

ctg aag agc aag gca gag ctg ggt ggc atc tca gaa ctg gca tct gga 1495
Leu Lys Ser Lys Ala Glu Leu Gly Gly Ile Ser Glu Leu Ala Ser Gly
470 475 480 485

cct ccc taactgggc cccgctggtc ccatttgctc attagaattt cctctcacat 1550
Pro Pro

cagtgggata cagaattcag tttctccctt gccaggtcct tgggatggtt gacccctgcc 1610

tctgcagtag ccttttgtga gtctgctaag gtagctctca cacacctcgg ctctggggtt 1670

gatacctgag cctgcaatag agccctgaaa tcaagagcat ggcttgagtg tgtgaatatg 1730

atgtgtgcac atgcttaatg agcgtgcaag tgtgcacacg tttgtggaga ggagggtgtt 1790

ctggcctgag aaggtaaaga agaggcatgt ccagtatgct ttgcagggtg tgtttgctct 1850

tttccatgcc catgcaaccc agattggggt ggagcaggaa ggagctcttt tctgttccca 1910

agcctcagaa ctcttgagct gtggcttact tgctgtcttc accaggttca agctccgtgg 1970

gccacactgc tgctgtgcca agaaggtgta cagcctcccc aggatggggc ctcatacaac 2030

ccttcatctg cactcaacat ttaatcgtgt ccttgctgtc tttttatttt cctttttgtt 2090 

tgttagcaaa aacctctatt tagatttcaa taatcagaga agtgtaaaat aaaacagatt 2150 

atattgtact tgaaaaaaaa aaaaaaaa 2178 

<210> 8 

<211> 487 

<212> PRT 

<213> Homo sapiens

<400> 8 

Met Asp Gly Thr Glu Thr Arg Gln Arg Arg Leu Asp Ser Cys Gly Lys
1 5 10 15

Pro Gly Glu Leu Gly Leu Pro His Pro Leu Ser Thr Gly Gly Leu Pro
20 25 30

Val Ala Ser Glu Asp Gly Ala Leu Arg Ala Pro Glu Ser Gln Ser Val
35 40 45

Thr Pro Lys Pro Leu Glu Thr Glu Pro Ser Arg Glu Thr Ala Trp Ser
50 55 60

Ile Gly Leu Gln Val Thr Val Pro Phe Met Phe Ala Gly Leu Gly Leu
65 70 75 80

Ser Trp Ala Gly Met Leu Leu Asp Tyr Phe Gln His Trp Pro Val Phe
85 90 95

Val Glu Val Lys Asp Leu Leu Thr Leu Val Pro Pro Leu Val Gly Leu
100 105 110

Lys Gly Asn Leu Glu Met Thr Leu Ala Ser Arg Leu Ser Thr Ala Ala
115 120 125

Asn Thr Gly Gln Ile Asp Asp Pro Gln Glu Gln His Arg Val Ile Ser
130 135 140

Ser Asn Leu Ala Leu Ile Gln Val Gln Ala Thr Val Val Gly Leu Leu
145 150 155 160

Ala Ala Val Ala Ala Leu Leu Leu Gly Val Val Ser Arg Glu Glu Val
165 170 175

Asp Val Ala Lys Val Glu Leu Leu Cys Ala Ser Ser Val Leu Thr Ala
180 185 190

Phe Leu Ala Ala Phe Ala Leu Gly Val Leu Met Val Cys Ile Val Ile
195 200 205

Gly Ala Arg Lys Leu Gly Val Asn Pro Asp Asn Ile Ala Thr Pro Ile
210 215 220

Ala Ala Ser Leu Gly Asp Leu Ile Thr Leu Ser Ile Leu Ala Leu Val
225 230 235 240

Ser Ser Phe Phe Tyr Arg His Lys Asp Ser Arg Tyr Leu Thr Pro Leu
245 250 255

Val Cys Leu Ser Phe Ala Ala Leu Thr Pro Val Trp Val Leu Ile Ala
260 265 270

Lys Gln Ser Pro Pro Ile Val Lys Ile Leu Lys Phe Gly Trp Phe Pro
275 280 285

Ile Ile Leu Ala Met Val Ile Ser Ser Phe Gly Gly Leu Ile Leu Ser
290 295 300

Lys Thr Val Ser Lys Gln Gln Tyr Lys Gly Met Ala Ile Phe Thr Pro
305 310 315 320

Val Ile Cys Gly Val Gly Gly Asn Leu Val Ala Ile Gln Thr Ser Arg
325 330 335

Ile Ser Thr Tyr Leu His Met Trp Ser Ala Pro Gly Val Leu Pro Leu
340 345 350

Gln Met Lys Lys Phe Trp Pro Asn Pro Cys Ser Thr Phe Cys Thr Ser
355 360 365

Glu Ile Asn Ser Met Ser Ala Arg Val Leu Leu Leu Leu Val Val Pro
370 375 380

Gly His Leu Ile Phe Phe Tyr Ile Ile Tyr Leu Val Glu Gly Gln Ser
385 390 395 400

Val Ile Asn Ser Gln Thr Phe Val Val Leu Tyr Leu Leu Ala Gly Leu
405 410 415

Ile Gln Val Thr Ile Leu Leu Tyr Leu Ala Glu Val Met Val Arg Leu
420 425 430

Thr Trp His Gln Ala Leu Asp Pro Asp Asn His Cys Ile Pro Tyr Leu
435 440 445

Thr Gly Leu Gly Asp Leu Leu Gly Thr Gly Leu Leu Ala Leu Cys Phe
450 455 460

Phe Thr Asp Trp Leu Leu Lys Ser Lys Ala Glu Leu Gly Gly Ile Ser
465 470 475 480 

Glu Leu Ala Ser Gly Pro Pro 
485 



<210> 9 

<211> 1616

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (31)..(1551)

<223> Clone IE4

<400> 9

ccgggctgcc aggcgcccag ctgtgcccag atg gat ggg aca gag acc cgg cag 54

Met Asp Gly Thr Glu Thr Arg Gln
1 5

cgg agg ctg gac agc tgt ggc aag cca ggg gag ctg ggg ctt cct cac 102
Arg Arg Leu Asp Ser Cys Gly Lys Pro Gly Glu Leu Gly Leu Pro His
10 15 20

ccc ctc agc aca gga gga ctc cct gta gcc tca gaa gat gga gct ctc 150
Pro Leu Ser Thr Gly Gly Leu Pro Val Ala Ser Glu Asp Gly Ala Leu
25 30 35 40

agg gcc cct gag agc caa agc gtg acc ccc aag cca ctg gag act gag 198
Arg Ala Pro Glu Ser Gln Ser Val Thr Pro Lys Pro Leu Glu Thr Glu
45 50 55

cct agc agg gag acc gcc tgg tcc ata ggc ctt cag gtg acc gtg ccc 246
Pro Ser Arg Glu Thr Ala Trp Ser Ile Gly Leu Gln Val Thr Val Pro
60 65 70

ttc atg ttt gca ggc ctg gga ctg tcc tgg gcc ggc atg ctt ctg gac 294
Phe Met Phe Ala Gly Leu Gly Leu Ser Trp Ala Gly Met Leu Leu Asp
75 80 85

tat ttc cag cac tgg cct gtg ttt gtg gag gtg aaa gac ctt ttg aca 342
Tyr Phe Gln His Trp Pro Val Phe Val Glu Val Lys Asp Leu Leu Thr
90 95 100

ttg gtg ccg ccc ctg gtg ggc ctg aag ggg aac ctg gag atg aca ctg 390
Leu Val Pro Pro Leu Val Gly Leu Lys Gly Asn Leu Glu Met Thr Leu
105 110 115 120

gca tcc aga ctc tcc aca gct gcc aac act gga caa att gat gac ccc 438
Ala Ser Arg Leu Ser Thr Ala Ala Asn Thr Gly Gln Ile Asp Asp Pro
125 130 135

cag gag cag cac aga gtc atc agc agc aac ctg gcc ctc atc cag gtg 486
Gln Glu Gln His Arg Val Ile Ser Ser Asn Leu Ala Leu Ile Gln Val
140 145 150

cag gcc act gtc gtg ggg ctc ttg gct gct gtg gct gcg ctg ctg ttg 534
Gln Ala Thr Val Val Gly Leu Leu Ala Ala Val Ala Ala Leu Leu Leu
155 160 165

ggc gtg gtg tct cga gag gaa gtg gat gtc gcc aag gtg gag ttg ctg 582
Gly Val Val Ser Arg Glu Glu Val Asp Val Ala Lys Val Glu Leu Leu
170 175 180

tgt gcc agc agt gtc ctc act gcc ttc ctt gca gcc ttt gcc ctg ggg 630
Cys Ala Ser Ser Val Leu Thr Ala Phe Leu Ala Ala Phe Ala Leu Gly
185 190 195 200

gtg ctg atg gtc tgt ata gtg att ggt gct cga aag ctc ggg gtc aac 678
Val Leu Met Val Cys Ile Val Ile Gly Ala Arg Lys Leu Gly Val Asn
205 210 215

cca gac aac att gcc acg ccc att gca gcc agc ctg gga gac ctc atc 726
Pro Asp Asn Ile Ala Thr Pro Ile Ala Ala Ser Leu Gly Asp Leu Ile
220 225 230

aca ctg tcc att ctg gct ttg gtt agc agc ttc ttc tac aga cac aaa 774
Thr Leu Ser Ile Leu Ala Leu Val Ser Ser Phe Phe Tyr Arg His Lys
235 240 245

gat agt cgg tat ctg acg ccg ctg gtc tgc ctc agc ttt gcg gct ctg 822
Asp Ser Arg Tyr Leu Thr Pro Leu Val Cys Leu Ser Phe Ala Ala Leu
250 255 260

acc cca gtg tgg gtc ctc att gcc aag cag agc cca ccc atc gtg aag 870
Thr Pro Val Trp Val Leu Ile Ala Lys Gln Ser Pro Pro Ile Val Lys
265 270 275 280

atc ctg aag ttt ggc tgg ttc cca atc atc ctg gcc atg gtc atc agc 918
Ile Leu Lys Phe Gly Trp Phe Pro Ile Ile Leu Ala Met Val Ile Ser
285 290 295

agt ttc gga gga ctc atc ttg agc aaa acc gtt tct aaa cag cag tac 966
Ser Phe Gly Gly Leu Ile Leu Ser Lys Thr Val Ser Lys Gln Gln Tyr
300 305 310

aaa ggc atg gcg ata ttt acc ccc gtc ata tgt ggt gtt ggt ggc aat 1014
Lys Gly Met Ala Ile Phe Thr Pro Val Ile Cys Gly Val Gly Gly Asn
315 320 325

ctg gtg gcc att cag acc agc cga atc tca acc tac ctg cac atg tgg 1062
Leu Val Ala Ile Gln Thr Ser Arg Ile Ser Thr Tyr Leu His Met Trp
330 335 340

agt gca cct ggc gtc ctg ccc ctc cag atg aag aaa ttc tgg ccc aac 1110
Ser Ala Pro Gly Val Leu Pro Leu Gln Met Lys Lys Phe Trp Pro Asn
345 350 355 360

ccg tgt tct act ttc tgc acg tca gaa atc aat tcc atg tca gct cga 1158
Pro Cys Ser Thr Phe Cys Thr Ser Glu Ile Asn Ser Met Ser Ala Arg
365 370 375

gtc ctg ctc ttg ctg gtg gtc cca ggc cat ctg att ttc ttc tac atc 1206
Val Leu Leu Leu Leu Val Val Pro Gly His Leu Ile Phe Phe Tyr Ile
380 385 390

atc tac ctg gtg gag ggt cag tca gtc ata aac agc cag acc ttt gtg 1254
Ile Tyr Leu Val Glu Gly Gln Ser Val Ile Asn Ser Gln Thr Phe Val
395 400 405

gtg ctc tac ctg ctg gca ggc ctg atc cag gtg aca atc ctg ctg tac 1302
Val Leu Tyr Leu Leu Ala Gly Leu Ile Gln Val Thr Ile Leu Leu Tyr
410 415 420

ctg gca gaa gtg atg gtt cgg ctg act tgg cac cag gcc ctg gat cct 1350
Leu Ala Glu Val Met Val Arg Leu Thr Trp His Gln Ala Leu Asp Pro
425 430 435 440

gac aac cac tgc atc ccc tac ctt aca ggg ctg ggg gac ctg ctc ggt 1398
Asp Asn His Cys Ile Pro Tyr Leu Thr Gly Leu Gly Asp Leu Leu Gly
445 450 455

tca agc tcc gtg ggc cac act gct gct gtg cca aga agg tgt aca gcc 1446
Ser Ser Ser Val Gly His Thr Ala Ala Val Pro Arg Arg Cys Thr Ala
460 465 470

tcc cca gga tgg ggc ctc ata caa ccc ttc atc tgc act caa cat tta 1494
Ser Pro Gly Trp Gly Leu Ile Gln Pro Phe Ile Cys Thr Gln His Leu
475 480 485

atc gtg tcc ttg ctg tct ttt tat ttt cct ttt tgt ttg tta gca aaa 1542
Ile Val Ser Leu Leu Ser Phe Tyr Phe Pro Phe Cys Leu Leu Ala Lys
490 495 500

acc tct att tagatttca ataatcagag aagtgtaaaa taaaacagat tatattgtaa 1600
Thr Ser Ile
505

aaaaaaaaaa aaaaaa 1616



<210> 10 

<211> 507 

<212> PRT 

<213> Homo sapiens 

<400> 10 

Met Asp Gly Thr Glu Thr Arg Gln Arg Arg Leu Asp Ser Cys Gly Lys
1 5 10 15

Pro Gly Glu Leu Gly Leu Pro His Pro Leu Ser Thr Gly Gly Leu Pro
20 25 30

Val Ala Ser Glu Asp Gly Ala Leu Arg Ala Pro Glu Ser Gln Ser Val
35 40 45

Thr Pro Lys Pro Leu Glu Thr Glu Pro Ser Arg Glu Thr Ala Trp Ser
50 55 60

Ile Gly Leu Gln Val Thr Val Pro Phe Met Phe Ala Gly Leu Gly Leu
65 70 75 80

Ser Trp Ala Gly Met Leu Leu Asp Tyr Phe Gln His Trp Pro Val Phe
85 90 95

Val Glu Val Lys Asp Leu Leu Thr Leu Val Pro Pro Leu Val Gly Leu
100 105 110

Lys Gly Asn Leu Glu Met Thr Leu Ala Ser Arg Leu Ser Thr Ala Ala
115 120 125

Asn Thr Gly Gln Ile Asp Asp Pro Gln Glu Gln His Arg Val Ile Ser
130 135 140

Ser Asn Leu Ala Leu Ile Gln Val Gln Ala Thr Val Val Gly Leu Leu
145 150 155 160

Ala Ala Val Ala Ala Leu Leu Leu Gly Val Val Ser Arg Glu Glu Val
165 170 175

Asp Val Ala Lys Val Glu Leu Leu Cys Ala Ser Ser Val Leu Thr Ala
180 185 190

Phe Leu Ala Ala Phe Ala Leu Gly Val Leu Met Val Cys Ile Val Ile
195 200 205

Gly Ala Arg Lys Leu Gly Val Asn Pro Asp Asn Ile Ala Thr Pro Ile
210 215 220

Ala Ala Ser Leu Gly Asp Leu Ile Thr Leu Ser Ile Leu Ala Leu Val
225 230 235 240

Ser Ser Phe Phe Tyr Arg His Lys Asp Ser Arg Tyr Leu Thr Pro Leu
245 250 255

Val Cys Leu Ser Phe Ala Ala Leu Thr Pro Val Trp Val Leu Ile Ala
260 265 270

Lys Gln Ser Pro Pro Ile Val Lys Ile Leu Lys Phe Gly Trp Phe Pro
275 280 285

Ile Ile Leu Ala Met Val Ile Ser Ser Phe Gly Gly Leu Ile Leu Ser
290 295 300

Lys Thr Val Ser Lys Gln Gln Tyr Lys Gly Met Ala Ile Phe Thr Pro
305 310 315 320

Val Ile Cys Gly Val Gly Gly Asn Leu Val Ala Ile Gln Thr Ser Arg
325 330 335

Ile Ser Thr Tyr Leu His Met Trp Ser Ala Pro Gly Val Leu Pro Leu
340 345 350

Gln Met Lys Lys Phe Trp Pro Asn Pro Cys Ser Thr Phe Cys Thr Ser
355 360 365

Glu Ile Asn Ser Met Ser Ala Arg Val Leu Leu Leu Leu Val Val Pro
370 375 380

Gly His Leu Ile Phe Phe Tyr Ile Ile Tyr Leu Val Glu Gly Gln Ser
385 390 395 400

Val Ile Asn Ser Gln Thr Phe Val Val Leu Tyr Leu Leu Ala Gly Leu
405 410 415

Ile Gln Val Thr Ile Leu Leu Tyr Leu Ala Glu Val Met Val Arg Leu
420 425 430

Thr Trp His Gln Ala Leu Asp Pro Asp Asn His Cys Ile Pro Tyr Leu
435 440 445

Thr Gly Leu Gly Asp Leu Leu Gly Ser Ser Ser Val Gly His Thr Ala
450 455 460

Ala Val Pro Arg Arg Cys Thr Ala Ser Pro Gly Trp Gly Leu Ile Gln
465 470 475 480

Pro Phe Ile Cys Thr Gln His Leu Ile Val Ser Leu Leu Ser Phe Tyr
485 490 495

Phe Pro Phe Cys Leu Leu Ala Lys Thr Ser Ile
500 505